I. INTRODUCTION

A. GANPUI is an acronym for "A Generalized Approach to New Problems in
Ultrasonic Inspection". Conceptually, it is an operator-computer interactive
scheme that involves the application of the latest techniques in ultrasonic
inspection, pattern recognition, and minicomputer technology. The following
paragraphs offer a brief overview of GANPUI. Detailed descriptions of GANPUI
components comprise the main body of the text. Sample applications of
GANPUI are also included.

B. The entire procedure may be divided into three general categories: input,
processing, and output.

1. The input consists of several sub-processes, the first of which is
the acquisition of reliable ultrasonic waveforms. These waveforms are then
digitized, that is, decomposed into discrete time sequences. The PDP 11/05
minicomputer, for example, accepts these sequences and performs certain
mathematical operations on them, such as determining maximums and minimums.
This process is known as feature extraction. Features that are useful in
ultrasonic examination include center frequency, 6 dB down bandwidth, energy
ratios over specified frequency intervals of a frequency spectrum, etc. At
this stage, the computer operator selects the particular features to be
employed in algorithm development. Once the feature values have been computed,
they are stored in a vector filing system. This completes the input stage of
the system.

2. Processing utilizes several complex schemes or algorithms to search
for innate groupings of feature data. These data values are ordered with
respect to their effect in defining particular groupings. Upon completion of
processing, the significant features are combined to develop a classification
scheme.

3. GANPUI output assists the ultrasonic investigator by minimizing the
data acquired in decision making. Algorithm development makes use of many
techniques in learning network analysis and in pattern recognition, the goal
being to establish some relationship between classification mode and a number
of important ultrasonic signal features. Regression analysis is considered at
various points of the algorithm development process. Such techniques as
probability density function analysis, cluster analysis, minimum distance
classification, adaptive learning polynomials, and a Fisher linear discriminant
are currently being used in our algorithm development test system.

C. An essential element in the GANPUI program of study is associated with the
utilization of good training data. Good test samples are required so that the
computer can be trained to recognize certain patterns. Test samples are
obviously required to evaluate the classification algorithms developed by
GANPUI.

D. Several problems that are being studied by various ultrasonic research
groups that make use of GANPUI concepts include composite material inspection,
aircraft and space shuttle adhesive bond evaluation, the detection of stress
corrosion cracking in stainless steel piping for the nuclear industry, flaw
growth propagation in the shipping and aircraft industries, and the early
detection of breast cancer in the field of diagnostic ultrasound.

E.  A  flow  chart  of  GANPUI  is  shown  in  Table  1. 

Table  1.  GANPUI  Flow  Chart 
II. FOURIER SERIES AND TRANSFORM THEORY

A. FOURIER DECOMPOSITION. From a mathematical viewpoint, any arbitrary
mathematical function could be resolved or decomposed into a finite sum of
some other functions.

1. For example, if we were to examine the mathematical response of a
system from a rectangularly shaped input function, we could, if mathematically
convenient, compute the response from two rectangularly shaped functions as
shown in Figure 1. In this particular example, the decomposition process into
two rectangularly shaped functions does not provide us with any additional
information or mathematical computational efficiency. The rectangular input
pulse given above could, however, be decomposed into a sum of some other
waveforms; say rectangular segments of fixed pulse duration for which a
mathematical solution might be readily available in the literature or in a
computer. The total solution could then be obtained by examining the
contributions from each smaller segment and adding them together in linear
fashion to obtain the total solution.

2. Let us consider now a different form for the input function. The pulse
form shown below could be treated in a mathematical sense as a finite number
of rectangularly shaped input pulse forms as illustrated in Figure 2. In the
limiting process, as the pulse duration, δ, of the rectangularly shaped pulse
decreases to some infinitesimal value, the response function or solution to
some system could be identical.

3. A more useful function decomposition approach exists in a mathematical
sense than those presented above in the rectangular segment approach. This
approach is called frequency analysis or Fourier Series analysis. In this
particular approach, a given functional shape is resolved into a finite number
of sinusoidal waveforms. As an example, an arbitrary pulse form could be
considered mathematically as the sum illustrated in Figure 3; the functions
shown on the right representing continuous wave or sinusoidal wave forms.

As the number of waves being added together increases, the more accurate will
be the comparison of the resulting waveform with the initial waveform. As an
example, if a rectangular pulse were to be resolved into one continuous wave
form, the approximation would not be good. Two terms would be better. As
illustrated in the example in Figure 4, considering the rectangular function as
an odd function, adding both first and third harmonics gives the approximation
of the original function. In order to obtain the corners of the rectangular
pulse, discontinuities being critical points and difficult points to
approximate in a Fourier series sense, a fairly large number of terms, say
[number illegible] or more, would be required to approach the edges of the
rectangular pulse.
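
The effect of adding harmonics can be checked numerically. The following is a minimal sketch, assuming a unit-amplitude odd square wave of period 2π, whose series contains only odd sine harmonics with coefficients 4/(πn); the function names and the choice of 1, 2, 10, and 50 terms are illustrative and do not come from the text.

```python
import numpy as np

def square_wave_partial_sum(t, n_terms):
    """Sum the first n_terms odd harmonics of a unit odd square wave (period 2*pi)."""
    total = np.zeros_like(t, dtype=float)
    for k in range(n_terms):
        n = 2 * k + 1                       # odd harmonics only: 1, 3, 5, ...
        total += (4.0 / np.pi) * np.sin(n * t) / n
    return total

t = np.linspace(-np.pi, np.pi, 2001)
target = np.sign(np.sin(t))                 # ideal odd rectangular (square) wave

def rms_error(n_terms):
    """Root-mean-square difference between the partial sum and the ideal wave."""
    approx = square_wave_partial_sum(t, n_terms)
    return float(np.sqrt(np.mean((approx - target) ** 2)))

for n_terms in (1, 2, 10, 50):
    print(n_terms, "terms -> rms error", round(rms_error(n_terms), 4))
```

The error falls as terms are added, but slowly, because most of the remaining error is concentrated at the discontinuities.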
4. The Fourier Series of the periodic function is defined by

$$f(t) = \sum_{n=-\infty}^{\infty} a_n e^{jn\omega_0 t}$$
B. FOURIER TRANSFORM THEORY. Much of the pattern recognition work associated
with ultrasonics uses the Fourier transform of a time domain signal as a
feature source. A feature is a parameter defined on a function. Consider the
graphical representation of a function shown below in Figure 5.


Figure  5.  Example  of  Fourier  Spectrum  Parameterization 

Three possible features are shown on the illustration. The Fourier transform
will be reviewed in this section because of its importance in ultrasonic
work.

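By definition, the Fourier transform of a function of time f(t) is a
function of angular frequency F(ω), given by the relationship

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt$$

where

$$e^{-j\omega t} = \cos \omega t - j \sin \omega t, \qquad j = \sqrt{-1}$$

$$\mathrm{Re}[F(\omega)] = \int_{-\infty}^{\infty} f(t) \cos \omega t \, dt \qquad \text{(real part)}$$

$$\mathrm{Im}[F(\omega)] = \int_{-\infty}^{\infty} -f(t) \sin \omega t \, dt \qquad \text{(imaginary part)}$$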

Terms usually associated with the Fourier transform are Power Spectrum and
Phase Angle. These are defined below.

Power Spectrum:

$$|F(\omega)|^2 = (\mathrm{Re}[F(\omega)])^2 + (\mathrm{Im}[F(\omega)])^2$$

Phase Angle:

$$\Phi(\omega) = \tan^{-1}\left(\mathrm{Im}[F(\omega)] / \mathrm{Re}[F(\omega)]\right)$$

An example of a time function along with its spectrum and phase angle is given
in Figure 6 on the following page.
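
For sampled data, these two quantities can be approximated with a discrete Fourier transform. The sketch below assumes NumPy, a hypothetical 100 MHz sampling rate, and a synthetic 5 MHz Gaussian-windowed pulse; none of these values come from the text.

```python
import numpy as np

def spectrum_features(signal, dt):
    """Return frequency axis (Hz), power spectrum, and phase angle (radians)."""
    F = np.fft.rfft(signal) * dt            # discrete approximation of the transform integral
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    power = F.real ** 2 + F.imag ** 2       # (Re[F])^2 + (Im[F])^2
    phase = np.arctan2(F.imag, F.real)      # four-quadrant form of tan^-1(Im/Re)
    return freqs, power, phase

dt = 1.0e-8                                  # assumed sampling interval (100 MHz rate)
t = np.arange(1024) * dt
# Synthetic pulse: 5 MHz carrier under a Gaussian envelope centred at 2 microseconds
pulse = np.exp(-((t - 2e-6) / 3e-7) ** 2) * np.cos(2 * np.pi * 5e6 * (t - 2e-6))

freqs, power, phase = spectrum_features(pulse, dt)
peak_frequency = freqs[np.argmax(power)]     # frequency bin carrying the most power
print(peak_frequency)
```

`np.arctan2` is used instead of a plain arctangent of the ratio so that the phase lands in the correct quadrant when the real part is negative or zero.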
[Figure 6 panels: Power Spectrum; Phase Angle]


The similarity between the Fourier Transform and the Fourier Series should be
noted. It can be shown that the discrete spectrum resulting from Fourier Series
analysis has the Fourier Transform continuous spectrum as its envelope. See
Figure 7.


Figure 7. Illustration of the Relationship Between
Fourier Series and Fourier Transforms
III. SIGNAL PROCESSING

A. ANALOG TO DIGITAL CONVERSION. There are basically two types of signals
encountered in ultrasonics. They are generally called "narrow band" and
"broadband". These terms refer to the spectral characteristics of a signal.
These concepts will be made clear by the use of a simple correlation that
exists between the time domain and the frequency domain.

1. The Fourier spectrum of a continuous wave of frequency $f_0$ is a spike
located at $f_0$ in the frequency domain. This would be the ultimate narrow band
signal. The Fourier spectrum of a single spike in the time domain would be a
constant extending from zero to infinite frequency. This would be a perfect
broadband pulse. See Figure 8, an illustration of these types of signals. An
intuitive correlation that might be made is that the longer the signal duration,
the narrower the frequency spectrum. These are the kinds of signals that are
generally processed during ultrasonic analyses.

2.  The  important  question  is  what  happens  when  one  has  only  a  finite  number 
of  sample  points  from  a  signal.  The  points  will  be  digitally  processed  to 
obtain  a  Fourier  spectrum.  Is  this  spectrum  a  reasonable  representation  of  the 
true  frequency  content  of  the  continuous  signal?  Following  the  digitization 
process  step-by-step  will  illustrate  some  of  the  problems  that  do  occur. 

a. A time signal theoretically has a Fourier transform. Let us track
the effects that processing this time signal has on the theoretical or true
spectrum. First, the signal is sampled at some interval, say T; that is, data
are obtained every T seconds. Essentially, the time signal is multiplied by a
train of delta functions spaced T seconds apart. In the frequency domain, this
corresponds with a pulse train separated by 1/T frequency units. See Figures
9a and 9b.

b. Since only a finite number of samples can be processed, the sampled
waveform must be truncated. This is again a multiplication, but this time by
a rectangularly shaped function. The transform of such a function is a sinc
function. See Figures 9c and 9d. This multiplication also translates into a
convolution in the frequency domain. See Figure 9e. The sampled and truncated
time domain signal now has a distorted periodic frequency profile. This
continuous spectrum must also be sampled and truncated. These results are shown
in Figures 9f and 9g.

c. Particular attention should be paid to Figures 9c and 9d. Figure 9
shows that if the sampling rate is not high enough, considerable spectral
overlap may occur, thus distorting the true spectrum. This suboptimal sampling
is called Aliasing. There is a theorem, called the sampling theorem, which
states that if the sampling rate is at least twice the highest frequency
contained in the signal, then Aliasing will not occur. See Figure 10e.
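
A short numerical check makes the effect concrete. This is a sketch under assumed values (a 10 samples-per-second rate and 9 Hz and 1 Hz tones, none taken from the text): a tone above half the sampling rate produces exactly the same samples as a lower-frequency tone.

```python
import numpy as np

sampling_rate = 10.0                         # samples per second (assumed for illustration)
n = np.arange(20)
t = n / sampling_rate

high = np.sin(2 * np.pi * 9.0 * t)           # 9 Hz tone, above the 5 Hz limit for this rate
low = -np.sin(2 * np.pi * 1.0 * t)           # inverted 1 Hz tone

# sin(2*pi*9*n/10) = sin(2*pi*n - 2*pi*n/10) = -sin(2*pi*n/10), so the samples coincide
aliased = bool(np.allclose(high, low))
print(aliased)
```

Once the samples are taken, nothing in the data distinguishes the 9 Hz tone from the 1 Hz one, which is why the sampling rate has to be chosen before digitization rather than corrected afterwards.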

d. Figure 9d shows that a rectangular window function has a spectrum
with many side lobes in it. Convolving this with the periodic spectrum of
Figure 9 introduces ripples. This type of distortion is called leakage. This
problem has been under study for many years, and certain window functions have
been developed that have minimum side lobe energy.
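
Leakage, and the benefit of a tapered window, can be measured directly. The sketch below assumes NumPy and uses the Hann window (`np.hanning`) as one example of a low side lobe window; the text does not say which window functions it has in mind, and the 5-bin cutoff is an arbitrary choice for illustration.

```python
import numpy as np

n_samples = 256
n = np.arange(n_samples)
# 20.5 cycles in the record: the tone falls between DFT bins, the worst case for leakage
tone = np.sin(2 * np.pi * 20.5 * n / n_samples)

def leakage_fraction(window):
    """Fraction of spectral energy lying more than 5 bins away from the peak."""
    power = np.abs(np.fft.rfft(tone * window)) ** 2
    peak = int(np.argmax(power))
    far = np.abs(np.arange(len(power)) - peak) > 5
    return float(power[far].sum() / power.sum())

rect_leak = leakage_fraction(np.ones(n_samples))      # plain truncation (rectangular window)
hann_leak = leakage_fraction(np.hanning(n_samples))   # tapered window
print(rect_leak, hann_leak)
```

The tapered window trades a wider main lobe for much lower side lobes, so less energy appears at frequencies far from the true tone.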
3. Real world signals also have the additional component of noise. Noise
may be grouped into several categories. The two items just alluded to, Aliasing
and Leakage, might be considered sampling rate noise and mathematical noise.
Appropriate steps may be taken to eliminate this type of noise. Quantization,
that is, partitioning the signal into discrete levels, also introduces noise.
An analog signal having values located midway between two adjacent quantum levels
has a 50-50 chance of being quantized into either level. Other kinds of noise
include electronic noise and thermal noise.

4.  Ultrasonic  signals  may  be  considered  as  belonging  to  the  realm  of  random 
processes.  That  is,  each  time  a  signal  from  the  same  reflector  is  viewed,  it 
obtains  slightly  different  values.  The  distribution  of  these  variations  may  be 
considered  random. 

a. One way to get a better estimate of the true value of a signal is
to average over a set of similar signals. It can be shown that averaging
decreases the effects of quantization, electronic, and thermal noise.

The theory behind simple signal averaging is as follows:

Let $x_i$ denote an observed signal.

Let $s$ denote the true signal.

Let $n_i$ denote the noise content of the ith observation, then

$$x_i = s + n_i$$

Consider collecting an ensemble of N of these signals:

$$x_1 = s + n_1$$
$$x_2 = s + n_2$$
$$\vdots$$
$$x_N = s + n_N$$

Averaging then involves summing and division by N (simply scaling):

$$\sum_{i=1}^{N} x_i = Ns + \sum_{i=1}^{N} n_i$$

$$\frac{1}{N} \sum_{i=1}^{N} x_i = s + \frac{1}{N} \sum_{i=1}^{N} n_i$$

For N large enough,

$$\sum_{i=1}^{N} n_i$$

is small, and division by N makes the second right-hand term even smaller;

$$\frac{1}{N} \sum_{i=1}^{N} x_i \approx s \quad \text{for proper } N.$$
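
The averaging argument can be tried on synthetic data. This is a minimal sketch assuming zero-mean Gaussian noise that is independent between observations; the damped sinusoid standing in for the true signal s, the noise level, and the function name are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
s = np.sin(2 * np.pi * 5 * t) * np.exp(-4 * t)      # stand-in for the true signal s

def rms_error_of_average(n_signals):
    """Average n_signals noisy observations x_i = s + n_i and compare with s."""
    x = s + rng.normal(0.0, 0.5, size=(n_signals, t.size))
    estimate = x.mean(axis=0)                        # (1/N) * sum of x_i
    return float(np.sqrt(np.mean((estimate - s) ** 2)))

err_1 = rms_error_of_average(1)
err_100 = rms_error_of_average(100)
print(err_1, err_100)
```

Under these assumptions the residual noise shrinks roughly as one over the square root of N, so 100 averages give about a tenfold improvement.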

b. Averaging introduces the new problem of jitter. Jitter is time
shifting of signal components due to the variability of the trigger levels
necessary to initiate the analog-to-digital sampling process. Most often,
this occurs when instruments are first turned on. After a period of time
though, trigger levels tend to stabilize. If jitter is still present, a
process called correlation detection may be used.

c. A signal is captured and stored. A new signal is then obtained
and cross-correlated with the stored one. The maximum value of the
cross-correlation function locates on the time axis the number of sample units
the new waveform has been shifted away from the original. The second is time
shifted into agreement with the first one. The two signals are then averaged
and stored as a new reference signal. The process may be iterated until it is
thought that sufficient noise reduction has occurred.
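
One iteration of correlation detection can be sketched as follows, assuming NumPy's `np.correlate` for the cross-correlation and a circular shift for the realignment (reasonable only when the pulse sits well away from the ends of the record). The function names, the synthetic pulse, and the 7-sample jitter are illustrative assumptions.

```python
import numpy as np

def estimate_shift(reference, new):
    """Number of samples by which `new` lags `reference` (negative if it leads)."""
    xcorr = np.correlate(new, reference, mode="full")
    # Index len(reference) - 1 of the full correlation corresponds to zero lag
    return int(np.argmax(xcorr)) - (len(reference) - 1)

def align_and_average(reference, new):
    """Shift `new` into agreement with `reference`, then average the two."""
    shift = estimate_shift(reference, new)
    aligned = np.roll(new, -shift)           # circular shift; pulse assumed clear of record edges
    return 0.5 * (reference + aligned), shift

rng = np.random.default_rng(1)
n = np.arange(256)
pulse = np.exp(-((n - 100) / 20.0) ** 2) * np.sin(2 * np.pi * (n - 100) / 16.0)
reference = pulse + rng.normal(0.0, 0.02, n.size)
jittered = np.roll(pulse, 7) + rng.normal(0.0, 0.02, n.size)   # same pulse, 7 samples late

new_reference, shift = align_and_average(reference, jittered)
print(shift)
```

Feeding `new_reference` back in as the stored signal for the next capture gives the iteration described above.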

B. SIGNAL PROCESSING DEFINITIONS. Many textbooks on signal processing are
available, many of which could be useful in understanding the difficulties and
possible improvements of ultrasonic signal analysis. Highlights and definitions
of several terms that are encountered in the signal processing field are
outlined below. A review of the terms and basic concepts will serve to introduce
the subject and its many mathematical and electrical engineering areas of study.
The concept of a transfer function is illustrated in Figure 11. This idea is
used very often in signal processing systems and is referred to in the
electrical engineering and systems analysis literature. The transfer function is
often treated as a black box where the output function can be formulated as a
function of an input function by way of the black box or transfer function.


Out = F(In), where the black box (F) represents a transfer function.
Figure 11. Transfer Function Concept
1. Signal. A physical disturbance that contains information. The disturbance
may vary with time, temperature, pressure, etc. A traffic light is a
signal. The color is the disturbance and the information is stop, go, or
caution. In ultrasonics, voltage variations versus time are typical signals.

2. Processing. The automatic extraction of information from a signal.
For instance, the Fast Fourier Transform is a computer processing technique used
to extract the frequency information contained in a signal.

3. Analog Signal. A signal that is continuous in time.

4. Digital Signal. A signal that occurs in discrete time intervals, usually
represented as a sequence of numbers, each being restricted to an integer
multiple of a fundamental unit called a quantum.

5. Sampling. When it becomes impractical to process a signal in continuous
time, samples of the signal are taken at a set of predetermined discrete times.

6. Quantization. Restriction of sample values to a finite number of
possible values.

7. A/D Conversion (Analog to Digital). The procedure for sampling analog
signals and thereby converting the analog information into a digital sequence.

8. Spectral Analysis. The evaluation of the frequency content of a signal.
This is usually performed by Fourier transforming the signal and noting those
areas which have significant values.

9. Bandwidth.

a. The highest frequency above which there is no significant content
(0 to 10 MHz, 0 to 20 MHz, etc.).

b. When prefixed with 3 dB or 6 dB, the width of the spectral profile
measured at 0.707 of the peak value or 0.5 of the peak value, respectively
(5 to 10 MHz, 2 to 6 MHz, etc.).
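
The second definition translates directly into a measurement on a sampled magnitude spectrum. The sketch below assumes a single-humped spectral profile; the function name, the frequency axis, and the synthetic Gaussian profile centred at 5 MHz are hypothetical.

```python
import numpy as np

def down_bandwidth(freqs, magnitude, fraction):
    """Band edges and width where magnitude stays at or above fraction * peak.

    fraction = 0.707 gives the 3 dB width, fraction = 0.5 the 6 dB width.
    Assumes a single-humped spectral profile.
    """
    above = np.nonzero(magnitude >= fraction * magnitude.max())[0]
    low, high = freqs[above[0]], freqs[above[-1]]
    return low, high, high - low

freqs = np.linspace(0.0, 20.0, 2001)                      # MHz, hypothetical axis
magnitude = np.exp(-0.5 * ((freqs - 5.0) / 1.0) ** 2)     # synthetic profile centred at 5 MHz

low6, high6, bw6 = down_bandwidth(freqs, magnitude, 0.5)
low3, high3, bw3 = down_bandwidth(freqs, magnitude, 0.707)
print(bw3, bw6)
```

The midpoint of the two band edges is one common way to define a center frequency feature, although the text does not specify which definition it uses.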

10. Sampling Theorem. A theorem that states an analog signal must be
sampled at a rate of at least 2 fmax (that is, at intervals no longer than
1/(2 fmax)), where fmax is the highest frequency contained in the signal, in
order to ensure a faithful representation of the signal in the digital domain.

11. Aliasing. The misinterpretation of a signal due to too low a sampling
rate. When one looks at an airplane propeller, it seems to be going slow (or
even backward). This is due to the fact that the eye cannot sample the visual
information at a high enough rate. Aliasing can produce incorrect electronic
signals, since a set of samples taken at too low a rate could actually
represent several higher frequency signals. See Figure 12.


12.  Filter.  A  mathematical  algorithm  or  computational  procedure  used  to 
process  digital  data.  These  algorithms  are  implemented  either  in  software 
(using  computer  language)  or  in  hardware  (actual  digital  circuitry). 
Figure 12. Aliasing


13.  Transfer  Function.  A  mathematical  representation  of  the  effects  a 
physical  system  will  have  on  an  input,  regardless  of  what  that  input  may  be. 
It  represents  the  inherent  characteristics  of  the  system. 


14.  Linear  Filter.  A  filter  where  the  property  of  superposition  is  known 
to  hold.  That  is,  if  two  inputs  are  added  together,  the  output  is  the  sum  of 
the  two  individual  outputs  obtained  from  each  input  alone. 

15. Auto Correlation. A statistical measure of the expected value of the
product f(n) · f(n + k), where f(n) is a signal at time n, and f(n + k) is the
value of the signal k units later. k is called the lag.

$$AC(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(t)\, f(t + \tau)\, dt$$

This function might be useful for the alignment of similar signals displaced
relative to each other in time.


16.  Power  Spectrum.  The  magnitude  of  the  Fourier  spectrum  of  a  signal. 

17. Power Spectral Density. The Fourier transform of the auto-correlation
of a signal.

18. Signal to Noise Ratio. (A/σn) The ratio of peak amplitude to the
root mean square of the noise in a signal.

19. White Noise. (Wideband Gaussian noise) A signal whose power spectrum
is a constant.

20. Cross Correlation. A statistical measure of the expected value of the
product f(n) · g(n + k), where f(n) is a signal at time n, and g(n + k) is
another signal at time n + k.

$$CR(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(t)\, g(t + \tau)\, dt$$
a. Cross correlation may be useful for the detection and location of
a signal which is embedded in noise, since only those components which are not
noise will have non-zero correlation.

b. Intuitively, correlation is similar to a template matching
procedure. One signal is displaced relative to another (or the same) and the two
compared. The correlation function is a maximum at the displacement where the
two signals agree the most, and a minimum where they coincide the least.


21. Deconvolution. The process of solving for the function
K(t, τ) in the equation

$$\int f(t)\, K(t, \tau)\, dt = g(\tau)$$

given the functions f(t) and g(τ).

K(t, τ) is known as the kernel of the integral.


a. This is related to the theory of linear systems, where it is shown
that if f(t) is an input to a system, g(τ) is the output, with K(t, τ) being
the system transfer function.

f(t)  --->  [ K(t, τ) ]  --->  g(τ)
input       transfer function       output

b. Fourier analysis also shows that if F(ω), K(ω), and G(ω) are the
Fourier transforms of f(t), K(t, τ), and g(τ) respectively, then

$$K(\omega) = \frac{G(\omega)}{F(\omega)}$$

This complex division is known as deconvolution. Deconvolution could be useful
in ultrasonic analysis, for example, in transducer compensation analysis, that
is, making one transducer appear to be another, perhaps more suitable transducer.
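
The frequency domain division can be sketched with a discrete Fourier transform. This example assumes circular convolution (the model the DFT implies) and adds a small constant `eps` to guard against division by near-zero values of F(ω); that regularization, the function name, and the test sequences are assumptions of this sketch, not part of the text.

```python
import numpy as np

def deconvolve(f, g, eps=1e-12):
    """Estimate the system response k from input f and output g, where g = f (*) k.

    Uses circular convolution; eps is an assumed regularization that keeps the
    division stable where F(w) is close to zero.
    """
    F = np.fft.fft(f)
    G = np.fft.fft(g)
    K = G * np.conj(F) / (np.abs(F) ** 2 + eps)   # regularized form of G(w) / F(w)
    return np.real(np.fft.ifft(K))

n = 64
f = np.zeros(n)
f[:3] = [1.0, 0.5, 0.25]                          # input with no spectral zeros
k_true = np.zeros(n)
k_true[:4] = [0.2, 0.6, -0.3, 0.1]                # hypothetical system response

# Simulated output: circular convolution of f with k_true, done in the frequency domain
g = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(k_true)))
k_est = deconvolve(f, g)
print(np.max(np.abs(k_est - k_true)))
```

With measured, noisy data the plain division amplifies noise wherever F(ω) is small, so a larger `eps` or some other form of regularization is usually needed.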

22. Signal Averaging. A mathematical process (or filter) which extracts
the central tendency of a signal. It is usually used to eliminate noise.
IV. BASIC PATTERN RECOGNITION AND CLASSIFICATION PROCESSES

A.  SIMPLE  TECHNIQUES.  One  manner  in  which  man  has  been  expanding  the  general 
capabilities  of  the  digital  computer  is  the  concept  of  artificial  intelligence. 
Hopefully,  the  digital  computer  will  perform  perceptual  tasks  assigned  to  it. 
Pattern  recognition  has  received  considerable  attention  in  this  area. 

1. The basis for using pattern recognition in solving practical problems
lies in the assumption that a logical means exists to train a computer to
associate given data with a particular test response.

2. The basic form for a classification process, as illustrated in Figure
13, consists of a data acquisition process, parameter or feature extractor,
and a classifier. Note: Before the system functions correctly, the classifier
must be trained to provide a solution having a higher probability of being
correct than the system previously used.


Figure 13. The Pattern Recognition and Classification Process
3. Figure 14 illustrates how Probability Density Function (PDF) curves can
be used to obtain feature effectiveness. Note that feature 1 has
the same values for the two different classes in the first illustration.


[Figure 14 panel labels: Probability (vertical axis); Poor Feature; Excellent Feature]

Figure 14. Sample Probability Density Function Curves for
a Typical 2 Class Classification Problem
There is no differentiation capability for the feature. Features 2 and 3
are also somewhat limited in their capability for classifying the problem
as either class 1 or 2. Note, however, that feature 4, possibly center
frequency of the reflected signal, provides a clearer differentiation
between class 1 and class 2, and of course feature 5, possibly a 6 dB
down frequency bandwidth, provides an excellent feature for differentiating
the two classes with 100% reliability. Quite often, however,
the features fall in categories such as those illustrated in features 2
and 3. The probability density function curves nevertheless provide us with
insight into the difficulties that might be associated with the classification
problem in pattern recognition. Obviously, if results similar to those
for feature 5 occur, the solution to the problem is complete. If this is not
the case, it is often desirable to examine two-dimensional feature profiles
as illustrated in Figure 15.
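
One way to put a number on feature effectiveness is to estimate each class's PDF with a histogram and measure how much the two curves overlap. The text does not prescribe this overlap measure; it is an assumption of this sketch, as are the function name and the synthetic Gaussian feature values.

```python
import numpy as np

def pdf_overlap(class1_values, class2_values, bins=30):
    """Overlap (0 = fully separated, 1 = identical) of two estimated feature PDFs."""
    lo = min(class1_values.min(), class2_values.min())
    hi = max(class1_values.max(), class2_values.max())
    p1, edges = np.histogram(class1_values, bins=bins, range=(lo, hi), density=True)
    p2, _ = np.histogram(class2_values, bins=bins, range=(lo, hi), density=True)
    # Area under the pointwise minimum of the two estimated densities
    return float(np.sum(np.minimum(p1, p2) * np.diff(edges)))

rng = np.random.default_rng(2)
poor_1 = rng.normal(0.0, 1.0, 500)       # poor feature: same values for both classes
poor_2 = rng.normal(0.0, 1.0, 500)
good_1 = rng.normal(0.0, 1.0, 500)       # excellent feature: classes well separated
good_2 = rng.normal(10.0, 1.0, 500)

poor_overlap = pdf_overlap(poor_1, poor_2)
good_overlap = pdf_overlap(good_1, good_2)
print(poor_overlap, good_overlap)
```

Ranking candidate features by this overlap gives a quick first screening before the two-dimensional profiles are examined.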


[Figure 15: two-dimensional feature profiles, panels a through d; axis label: feature 2]
4. If we were to plot in two-dimensional space feature 1 versus feature
2, and were lucky enough to obtain data clustering as either ball-like, ring
type, string type, and so on, as illustrated in Figure 15, again promise for
obtaining a reasonable solution exists. The combination of two-dimensional
profiles should be plotted for all promising type features as indicated by the
probability density function analysis. If the cluster situation in two-dimensional
feature space is not useful, it then becomes necessary to employ more
sophisticated algorithm analysis from pattern recognition.

5. The next approach that could be used for finding a solution to this
classification problem would be to consider aspects of Bayes' decision theory
in combination with the results obtained from the probability density function
analysis. A fuzzy logic decision algorithm could be established that classifies
a certain percentage of the total number of test situations encountered with a
100% reliability vector or index of performance. A sample fuzzy logic algorithm
is illustrated in Figure 16.

6. If problems are encountered in this approach, or a different kind of
reliability parameter is required, additional concepts in pattern recognition
must be explored. As an example, an index of performance vector that provides
us with 100% classification even though the algorithm reliability is only 80%
or 90% could be useful for many applications. Keep in mind that the index of
performance criteria depend on the classification levels and possible loss
function analysis. Loss functions can be incorporated into the index of
performance evaluation. As an example, an item classified as class 2 that is
really class 1 may not represent a serious error. On the other hand, calling
a class 2 situation class 1 could be serious. Suppose we are doing flaw
detection in metals. If class 1 represents porosity and class 2 cracking,
classification of porosity as cracking is obviously not serious, since it
results possibly in a small financial loss.
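
The role of an asymmetric loss function can be illustrated with a small expected-loss calculation. The loss values and class probabilities below are made up for illustration, and choosing the decision with the smallest expected loss is a standard decision-theory rule rather than a procedure stated in the text.

```python
import numpy as np

# loss[i, j] = cost of deciding class j+1 when the truth is class i+1 (illustrative values)
loss = np.array([[0.0, 1.0],      # truth is porosity: calling it cracking costs little
                 [20.0, 0.0]])    # truth is cracking: calling it porosity is serious

def decide(class_probabilities):
    """Pick the class (1 or 2) with the smallest expected loss."""
    expected_loss = class_probabilities @ loss    # expected cost of each possible decision
    return int(np.argmin(expected_loss)) + 1

likely_porosity = np.array([0.8, 0.2])            # 80% porosity, 20% cracking
decision = decide(likely_porosity)
print(decision)
```

With these numbers the rule calls the item cracking even though porosity is four times more likely, because the expected cost of missing a crack (0.2 × 20) outweighs the cost of a false alarm (0.8 × 1).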

B. FISHER LINEAR DISCRIMINANT. One of the major problems encountered in
pattern recognition work is the vastness of the feature space. Procedures that
are analytically and computationally manageable in low dimensional spaces
become impractical in higher dimension spaces. An ideal space is the
one-dimensional space represented by a straight line. The advantage of a Fisher
Linear Discriminant is that it projects all of the data from an N-dimensional
space onto the best line for separating the data. Once the data has been
projected onto the line, a threshold value may be selected which will separate the
data into two classes. Thus the Fisher Linear Discriminant is ideally suited
to a two-class problem, as illustrated in Figures 17 and 18.


[Axis label: Feature #1]

Figure 17. Features Projected Onto Line Determined by the
Fisher Linear Discriminant - 100% Reliability

Figure 18. Comparison of Two-Dimensional Space and
Fisher Linear Discriminant Data Scatter
The simplest way to project an N-dimensional space onto a line is by forming
a dot product:

$$\sum_{i} w_i x_i = y \qquad \text{(a scalar)}$$

This may be written as

$$y = \mathbf{w}^t \mathbf{x}$$

1. Consider a set of K samples (vectors) divided into two classes, $C_1$ and
$C_2$, with $N_1$ samples and $N_2$ samples respectively ($K = N_1 + N_2$). (If the
samples fall into two intermingled clusters, the result desired is that the
clusters be shrunk and their means well separated.) See Figure 15. Another way
of expressing this is to say that the difference of projected means is to be
maximized and the scatter within each cluster is to be minimized. The
mathematical formulation of this problem is given below.


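Let $\mathbf{x}$ denote a typical D-dimensional vector. Then for class 1 samples,
the vector mean is

$$\mathbf{m}_1 = \frac{1}{N_1} \sum_{\mathbf{x} \in C_1} \mathbf{x}$$

and for class 2

$$\mathbf{m}_2 = \frac{1}{N_2} \sum_{\mathbf{x} \in C_2} \mathbf{x}$$

The projected means would be

$$\tilde{m}_1 = \frac{1}{N_1} \sum_{y \text{ projected from } C_1} y, \qquad \tilde{m}_2 = \frac{1}{N_2} \sum_{y \text{ projected from } C_2} y$$

that is

$$\tilde{m}_i = \frac{1}{N_i} \sum_{\mathbf{x} \in C_i} \mathbf{w}^t \mathbf{x} = \mathbf{w}^t \mathbf{m}_i, \qquad i = 1, 2$$

The squared difference of projected means is then

$$|\tilde{m}_1 - \tilde{m}_2|^2 = |\mathbf{w}^t \mathbf{m}_1 - \mathbf{w}^t \mathbf{m}_2|^2 = |\mathbf{w}^t (\mathbf{m}_1 - \mathbf{m}_2)|^2 = \mathbf{w}^t (\mathbf{m}_1 - \mathbf{m}_2)(\mathbf{m}_1 - \mathbf{m}_2)^t \mathbf{w}$$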
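This may be rewritten as

$$|\tilde{m}_1 - \tilde{m}_2|^2 = \mathbf{w}^t M \mathbf{w}$$

where

$$M = (\mathbf{m}_1 - \mathbf{m}_2)(\mathbf{m}_1 - \mathbf{m}_2)^t$$

The scatter S for each class may be defined as

$$S_1 = \sum_{\mathbf{x} \in C_1} (\mathbf{x} - \mathbf{m}_1)(\mathbf{x} - \mathbf{m}_1)^t$$

$$S_2 = \sum_{\mathbf{x} \in C_2} (\mathbf{x} - \mathbf{m}_2)(\mathbf{x} - \mathbf{m}_2)^t$$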

Also,  a  variance  measure  of  the  entire  data  set  may  be  defined  as 

v  =  i  (§  2  +  $  2) 

K  1  1  2  ’ 


where 

sx2-V  (y-#,)2 

y  e  projected  C 

s,2  (y  -  V2 

y c  projected  C 

or 

s  2  V"  ,  t  t  .2 

S  =  >  (  w  x  -  w  m  ) 

1  ~  ~  -  “I 

x  e 

*  ^  wt(x  -  m,)(x  -  nu)  ts  xe  C 

~  ~x  ~  L 

O 

§  2  c  wtb]v)  (likewise  for  §2  ) 

with  as  above. 

Let S = S_1 + S_2; then

    w^t S_1 w + w^t S_2 w = w^t (S_1 + S_2) w,

    \tilde{s}_1^2 + \tilde{s}_2^2 = w^t S w.

The Fisher Linear Discriminant is defined as the w^t x for which

    J(w) = \frac{|\tilde{m}_1 - \tilde{m}_2|^2}{\tilde{s}_1^2 + \tilde{s}_2^2}

assumes its maximum value. It should be noted that, in a sense, J(w) is a
maximum when |\tilde{m}_1 - \tilde{m}_2| is a maximum and
\tilde{s}_1^2 + \tilde{s}_2^2 is a minimum.


In terms of the above formulations,

    J(w) = \frac{w^t M w}{w^t S w}.

This is known as a generalized Rayleigh quotient. The solution w that maximizes
J(w) is given by

    w = S^{-1} (m_1 - m_2).


Computationally, the vector means m_1 and m_2 are first calculated. Then the
matrices

    S_1 = \sum_{x \in C_1} (x - m_1)(x - m_1)^t
    S_2 = \sum_{x \in C_2} (x - m_2)(x - m_2)^t

are calculated and summed to give S. S is inverted and multiplied by the
difference of vector means (m_1 - m_2).
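
The computational steps just described can be sketched in a few lines. This is only an illustration, not the GANPUI implementation: it assumes NumPy is available, the function names and the two small sample sets are hypothetical, and it solves the linear system S w = (m_1 - m_2) instead of forming the inverse explicitly (the result is the same).

```python
import numpy as np


def fisher_discriminant(class1, class2):
    """Return w = S^{-1}(m1 - m2) plus the two class means.

    class1, class2: sequences of sample vectors, one sample per row.
    """
    x1 = np.asarray(class1, dtype=float)
    x2 = np.asarray(class2, dtype=float)
    m1 = x1.mean(axis=0)
    m2 = x2.mean(axis=0)
    d1 = x1 - m1
    d2 = x2 - m2
    # S = S1 + S2, the summed within-class scatter matrices.
    scatter = d1.T @ d1 + d2.T @ d2
    # Solving S w = (m1 - m2) is equivalent to multiplying by S^{-1}.
    w = np.linalg.solve(scatter, m1 - m2)
    return w, m1, m2


def fisher_criterion(w, class1, class2):
    """J(w): squared difference of projected means over projected scatter."""
    y1 = np.asarray(class1, dtype=float) @ w
    y2 = np.asarray(class2, dtype=float) @ w
    scatter = np.sum((y1 - y1.mean()) ** 2) + np.sum((y2 - y2.mean()) ** 2)
    return (y1.mean() - y2.mean()) ** 2 / scatter


# Hypothetical two-feature samples for two classes.
c1 = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.5], [2.5, 2.0]]
c2 = [[4.0, 1.0], [5.0, 2.5], [6.0, 2.0], [5.5, 1.0]]
w, m1, m2 = fisher_discriminant(c1, c2)
print("w =", w, " J(w) =", fisher_criterion(w, c1, c2))
```

Because J(w) is unchanged when w is scaled, only the direction of w matters; a threshold on y = w^t x then separates the two classes.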
C. INTRODUCTION TO ADAPTIVE LEARNING NETWORKS. This section will develop the
tools and concepts necessary for understanding the motivating philosophies
behind a learning network. There are essentially four divisions included in
this section. The first three will develop the mathematical machinery required
for conceptualization of a learning algorithm. The fourth and most critical
division will cover the analogy between human learning and mathematical
(computerized) learning.

1. MINIMUM SQUARED ERROR CONCEPTS. Consider the classic problem of finding
a line that approximates a set of data in the sense that the sum of the
squared errors is a minimum. One wants to find parameters, say a and b, such
that the line

    y = ax + b

is a good estimate of the inherent functional relationship of the data samples,
where the samples are given as

    S_1: (x_1, y_1)
    S_2: (x_2, y_2)
      ...
    S_N: (x_N, y_N).

Given that a and b exist, the error between the actual data and the estimate is

    error_i = y_i - a x_i - b

for each i. The sum of squared errors is

    \sum_{i=1}^{N} (error_i)^2 = \sum_{i=1}^{N} (y_i - a x_i - b)^2.

This last expression is the one that is to be minimized with respect to the
parameters a and b. The condition for a minimum is well known from the calculus:
the partial derivatives of the function with respect to the parameters must
equal zero.


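    \frac{\partial}{\partial a} \sum_{i=1}^{N} (error_i)^2
        = \sum_{i=1}^{N} 2 (y_i - a x_i - b)(-x_i) = 0.

The 2 is a constant, so

    \sum_{i=1}^{N} (y_i - a x_i - b) x_i
        = \sum_{i=1}^{N} x_i y_i - a \sum_{i=1}^{N} x_i^2 - b \sum_{i=1}^{N} x_i = 0,

which can be written as

    a \sum_{i=1}^{N} x_i^2 + b \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} x_i y_i.

Likewise

    \frac{\partial}{\partial b} \sum_{i=1}^{N} (error_i)^2
        = \sum_{i=1}^{N} 2 (y_i - a x_i - b)(-1) = 0

or

    \sum_{i=1}^{N} y_i - a \sum_{i=1}^{N} x_i - N b = 0,

    a \sum_{i=1}^{N} x_i + b N = \sum_{i=1}^{N} y_i.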


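In matrix form the two equations are

    \begin{bmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & N \end{bmatrix}
    \begin{bmatrix} a \\ b \end{bmatrix}
    =
    \begin{bmatrix} \sum x_i y_i \\ \sum y_i \end{bmatrix},

that is,

    Y \omega = \mathbf{b},

where \omega = (a, b)^t and \mathbf{b} is the right-hand-side vector (not the
intercept b).

If the matrix Y is nonsingular then the solution is

    \omega = Y^{-1} \mathbf{b}

or

    a = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - (\sum x_i)^2}

    b = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{N \sum x_i^2 - (\sum x_i)^2}.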

2. GENERALIZED MATRIX APPROACH; PSEUDOINVERSE. A more general problem may
be formulated as follows: We want to find the components of a vector \omega such
that the accumulated squared errors inherent in the expression

    X \omega = b

add to a minimum. Define e = error vector; then e = X\omega - b.

Noting that the sum of squared errors is also the squared length of the error
vector, we want to minimize

    ||e||^2 = ||X\omega - b||^2 = \sum_{i=1}^{N} (\omega^t x_i - b_i)^2   (t = transpose),

where x_i^t is the i-th row of X. Taking partials (denoted by \nabla) we have

    \nabla ||X\omega - b||^2 = \sum_{i=1}^{N} 2 (\omega^t x_i - b_i) x_i = 2 X^t (X\omega - b).

Setting the partials equal to zero,

    \nabla ||X\omega - b||^2 = 2 X^t (X\omega - b) = 0

    \rightarrow  X^t X \omega = X^t b.

If X^t X is nonsingular then

    \omega = (X^t X)^{-1} X^t b

is the solution to our generalized problem. We can write

    \omega = X^+ b

where X^+ = (X^t X)^{-1} X^t is called the "pseudoinverse" of X.

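As an example, we will consider the case where an outcome (result) depends
on two other variables, for instance,

    y = \omega_0 + \omega_1 x_1 + \omega_2 x_2

with N data samples

    S_1: (x_1^{(1)}, x_2^{(1)}, y_1)
    S_2: (x_1^{(2)}, x_2^{(2)}, y_2)
      ...
    S_N: (x_1^{(N)}, x_2^{(N)}, y_N)

(the superscripts indicate the sample number). The generalized matrix
approach is

    X \omega = y

with

    X = \begin{bmatrix}
          1 & x_1^{(1)} & x_2^{(1)} \\
          1 & x_1^{(2)} & x_2^{(2)} \\
          \vdots & \vdots & \vdots \\
          1 & x_1^{(N)} & x_2^{(N)}
        \end{bmatrix},
    \omega = \begin{bmatrix} \omega_0 \\ \omega_1 \\ \omega_2 \end{bmatrix},
    y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}

so that

    X^t = \begin{bmatrix}
            1 & 1 & \cdots & 1 \\
            x_1^{(1)} & x_1^{(2)} & \cdots & x_1^{(N)} \\
            x_2^{(1)} & x_2^{(2)} & \cdots & x_2^{(N)}
          \end{bmatrix}

and

    X^t X = \begin{bmatrix}
              N & \sum_i x_1^{(i)} & \sum_i x_2^{(i)} \\
              \sum_i x_1^{(i)} & \sum_i (x_1^{(i)})^2 & \sum_i x_1^{(i)} x_2^{(i)} \\
              \sum_i x_2^{(i)} & \sum_i x_1^{(i)} x_2^{(i)} & \sum_i (x_2^{(i)})^2
            \end{bmatrix}.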
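
The two-variable example can be worked numerically. The sketch below is a hedged illustration that assumes NumPy; the function name and data are hypothetical. It builds X with a leading column of ones, solves the normal equations X^t X \omega = X^t y, and compares the result with NumPy's own pseudoinverse routine.

```python
import numpy as np


def least_squares_weights(x1, x2, y):
    """Solve X^t X w = X^t y for the model y = w0 + w1*x1 + w2*x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    y = np.asarray(y, dtype=float)
    # One row [1, x1, x2] per sample, as in the matrix X of the text.
    design = np.column_stack([np.ones_like(x1), x1, x2])
    weights = np.linalg.solve(design.T @ design, design.T @ y)
    return weights, design


# Hypothetical samples generated from y = 1 + 2*x1 - 3*x2.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]

weights, design = least_squares_weights(x1, x2, y)
# np.linalg.pinv computes the pseudoinverse X^+ directly.
weights_pinv = np.linalg.pinv(design) @ np.asarray(y)
print(weights, weights_pinv)
```

When X^t X is nonsingular the two routes agree; the pseudoinverse routine also returns a minimum-norm answer in the singular case, which the explicit formula does not cover.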
3. MULTINOMIALS. Linear approximations always remain linear upon composition.
As an example, consider the linear device below.


The device implements y = \omega_0 + \omega_1 x_1 + \omega_2 x_2. These devices
may also be used in tandem or in layers.

    y = p_0 + p_1 (\omega_0 + \omega_1 x_1 + \omega_2 x_2) + p_2 (z_0 + z_1 x_1 + z_3 x_3)

    y = (p_0 + p_1 \omega_0 + p_2 z_0) + (p_1 \omega_1 + p_2 z_1) x_1 + (p_1 \omega_2) x_2 + (p_2 z_3) x_3

    y = A + B x_1 + C x_2 + D x_3

    A, B, C, and D = constants.

The important point here is that the resulting y is still a first order
approximation of the functional relationship inherent in the data. The only
result to be obtained, regardless of topological structure, is

    y = \omega_0 + \sum_{i=1}^{N} \omega_i x_i

where N is the number of parameters or "features" that y depends on.

Data having a relationship involving cross-products and powers of features
would be poorly approximated by this scheme. Therefore, non-linear
approximations are now considered. The simple case involving 3 features is
shown below.

[Diagram of the layered device with inputs x_1, x_2, x_3]

    y = p_0 + p_{12} (\omega_0 + \omega_1 x_1 + \omega_2 x_2 + \omega_{12} x_1 x_2)
            + p_{13} (z_0 + z_1 x_1 + z_3 x_3 + z_{13} x_1 x_3)
            + p_{123} (\omega_0 + \omega_1 x_1 + \omega_2 x_2 + \omega_{12} x_1 x_2)
                      (z_0 + z_1 x_1 + z_3 x_3 + z_{13} x_1 x_3)

which implies terms in

    x_1, x_2, x_3, x_1 x_2, x_1 x_3

as expected and the new interactions

    x_1^2, x_2 x_3, x_1 x_2 x_3, x_1^2 x_2, x_1^2 x_3, x_1^2 x_2 x_3.

It is noted that cross-terms and power terms are automatically introduced by
layering. This suggests that the inclusion of non-linear terms and layering
gives a broader range of approximating power.
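
The list of new interactions can be checked mechanically by multiplying the monomials of the two first-layer expressions. The sketch below is illustrative only; it represents each monomial by a tuple of exponents for (x_1, x_2, x_3), and all names are hypothetical.

```python
from itertools import product

# Exponent tuples (e1, e2, e3) standing for x1**e1 * x2**e2 * x3**e3.
first_box = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]   # 1, x1, x2, x1*x2
second_box = [(0, 0, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1)]  # 1, x1, x3, x1*x3


def multiply(terms_a, terms_b):
    """Monomials produced by multiplying two multinomials term by term."""
    return {
        tuple(ea + eb for ea, eb in zip(ta, tb))
        for ta, tb in product(terms_a, terms_b)
    }


def name(exponents):
    """Readable name such as 'x1^2*x3' for an exponent tuple."""
    parts = []
    for index, power in enumerate(exponents, start=1):
        if power == 1:
            parts.append(f"x{index}")
        elif power > 1:
            parts.append(f"x{index}^{power}")
    return "*".join(parts) or "1"


layered = multiply(first_box, second_box)
# Terms that appear only after layering, not in either first-layer box.
new_terms = layered - set(first_box) - set(second_box)
print(sorted(name(t) for t in new_terms))
```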

4.  BASIC  ALN  CONCEPTS .  The  nonmathematical  concept  of  adaptive  learning 
is  easily  comprehended.  Experiences  are  recorded  by  an  individual  and  are  put 
into  a  scheme  or  logic  by  some  undefined  method.  He  has  a  theory  about  his 
experiences.  When  presented  with  new  experiences,  his  theories  are  tested. 
Some  theories  are  modified,  some  are  disregarded,  and  others  remain  unchanged. 
In  this  sense  the  individual  "adapts"  to  his  environment.  The  measure  of  how 
well  his  theories  perform  is  the  frequency  with  which  he  makes  successful 
decisions  on  new  experience. 

a.  The  jump  from  the  philosophical  domain  to  the  mathematical  one  is 
made  most  easily  by  defining  terms  or  creating  a  vocabulary. 

(1)  Feature  -  a  quantitative  measure  of  an  experience.  Examples: 
temperature,  velocity,  mean  value,  peak-to-peak  amplitudes,  etc. 

(2) Feature Vector - a column-like array of features.

    [ temperature ]
    [ velocity    ]
    [ mean value  ]
    [     ...     ]
    [ frequency   ]

b. Mathematically, once the parameters are defined they may be tagged.

    x_1 - temperature
    x_2 - velocity
    x_3 - mean value
    x_N - frequency

The feature vector may be written as

    x = (x_1, x_2, ..., x_N)^t.

Other terms will be defined as necessary. From the previous section, we have
good reason to suspect that the use of nonlinear expressions and layering will
lead to much broader theories than linear relationships. We will restrict
ourselves to expressions of the form

    y_{ij} = \omega_0 + \omega_1 x_i + \omega_2 x_j + \omega_3 x_i x_j + \omega_4 x_i^2 + \omega_5 x_j^2

where i \neq j. We will be interested in all experiences or features at the outset
because we are not aware of any particular relationship.

c. This is best done by considering all possible pairwise combinations of
features. Given N features, there will be

    N(N - 1)/2

possible combinations. As an example, consider N = 4, which gives six
combinations. A typical feature vector would be

    x = (x_1, x_2, x_3, x_4)^t.
This initial try will be called a "1st layer". The data that is put into this
layer is called "fitting" data, the word fitting referring to the previously
developed method that solves for \omega_0, \omega_1, ..., \omega_5 (pseudoinverse,
least squares concepts).

d. The data is submitted to each box in the 1st layer, and the coefficients
for each box are solved for using the matrix methods at the beginning of this
section.

e. The expression or algorithm we have, along with the coefficients,
constitutes our initial theories about our experiences (features). To decide
which theories ("boxes") are useful, we must subject them to new experiences.
This set of data is called the "selection" set. Those boxes which perform well
are retained, while those that perform poorly are disregarded. Assume that
boxes 3, 5, and 6 performed poorly. Then we have


We note that feature x_4 may be disregarded at this 1st layer, although the
possibility exists that it might be useful at another level in the network.
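
A first layer of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the network code used in the study: it assumes NumPy, uses `numpy.linalg.lstsq` in place of an explicit pseudoinverse, scores each box by mean squared error on the selection set (the report does not name its performance measure), and all function names and data are hypothetical. Feature indices are 0-based in the code.

```python
from itertools import combinations

import numpy as np


def quadratic_terms(xi, xj):
    """Columns 1, xi, xj, xi*xj, xi^2, xj^2 for one pair of features."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])


def fit_first_layer(x_fit, y_fit):
    """Fit one six-coefficient box per feature pair on the fitting data."""
    boxes = {}
    for i, j in combinations(range(x_fit.shape[1]), 2):
        design = quadratic_terms(x_fit[:, i], x_fit[:, j])
        coeffs, *_ = np.linalg.lstsq(design, y_fit, rcond=None)
        boxes[(i, j)] = coeffs
    return boxes


def select_boxes(boxes, x_sel, y_sel, keep):
    """Rank boxes by mean squared error on the selection data; keep the best."""
    errors = {}
    for (i, j), coeffs in boxes.items():
        predicted = quadratic_terms(x_sel[:, i], x_sel[:, j]) @ coeffs
        errors[(i, j)] = float(np.mean((predicted - y_sel) ** 2))
    ranked = sorted(errors, key=errors.get)
    return ranked[:keep], errors


# Hypothetical data: four features, outcome depends on features 0 and 1 only.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(80, 4))
y = 1.0 + 2.0 * x[:, 0] - x[:, 1] + 0.5 * x[:, 0] * x[:, 1]

boxes = fit_first_layer(x[:40], y[:40])                     # fitting set
kept, errors = select_boxes(boxes, x[40:], y[40:], keep=3)  # selection set
print(kept)
```

With N = 4 the layer contains N(N - 1)/2 = 6 boxes, and the outputs of the retained boxes would serve as inputs to the next layer.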

D. FACTOR ANALYSIS IN PATTERN RECOGNITION. As part of the pattern recognition
capability being developed at Drexel University for the Navy, we have
incorporated a factor analysis program to efficiently select the features most
crucial in the flaw detection problem. This program aids in the identification
of relationships among the variables and may contribute to the discovery of new
features which will improve our ability to discriminate between different
pattern classes. Also, through the use of this program we may be able to reduce
the number of measurements needed to make a successful discrimination.

1. FACTOR ANALYSIS. Factor analysis is an extension of principal component
analysis which determines the minimum number of independent dimensions needed to
account for most of the variance in an original set of variables. Factors are
derived measurement constructions which may produce parsimony [1], orthogonality,
increased reliability and increased normality over the observation measures
from which they are derived.

a. The digitized ultrasonic waveform can be represented as a signal
vector in an n-dimensional time space, where each dimension corresponds to the
signal voltage at a different latency [2] point in the analysis epoch. Principal
component analysis can identify the actual dimensionality of the "signal
space" containing a set of such vectors representing waveforms from many
derivations in the same experiment or from the same derivations in many
experiments. One can then construct a parsimonious description of each waveform
as a linear combination of a set of terms. Each term defines the relative
contribution of each feature to that waveform. These linear equations enable
great data compression, since any waveform in that signal space can be
described as some combination of the same basic factors. Thus, patterns of
factor weightings can be used to construct clusters of waveforms with
distinctive morphology. The p linear combinations of the variables (principal
components) are designed to capture as much of the variation in the data as
possible while at the same time being linearly independent of all the other
principal components.

b. A principal component Y_j is a linear combination of p variables.
Thus

    Y_j = B_1 X_{1j} + B_2 X_{2j} + ... + B_p X_{pj},   j = 1, 2, ..., m

is a principal component with unknown coefficients B_1, B_2, ..., B_p. In matrix
notation let

    B = \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_p \end{bmatrix},
    Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_m \end{bmatrix},
    and X = \begin{bmatrix}
              x_{11} & \cdots & x_{p1} \\
              x_{12} & \cdots & x_{p2} \\
              \vdots &        & \vdots \\
              x_{1m} & \cdots & x_{pm}
            \end{bmatrix}.

Then we can write the principal component as

    Y = XB.

For a given B, the sample variance of Y is given by

    var Y = B'SB

where S is the sample covariance matrix.

[1] dimensionality reduction
[2] not yet apparent, but there

c. The first problem in principal component analysis is to find the
principal component, Y_1, with the maximum variance. The problem then is

    maximize B'SB subject to B'B = 1.

d. If we let

    \phi = B'SB - \lambda (B'B - 1)

where \lambda is a Lagrange multiplier, the vector of partial derivatives is

    \frac{\partial \phi}{\partial B} = 2SB - 2\lambda B

which, upon being set to zero, reduces to

    (S - \lambda I) B = 0.

To solve this equation, we find the p characteristic roots of the covariance
matrix S. To maximize the variance of Y, we choose the largest characteristic
root, \lambda_1, of the covariance matrix S. The first principal component is
given by

    Y_1 = X B_1,

where B_1 is the normalized characteristic vector associated with \lambda_1,
with variance equal to \lambda_1.

e. In general, when there are p variables, the first principal component,
Y_1, is a linear combination of the p variables with coefficients equal
to the normalized characteristic vector associated with the largest
characteristic root of S. The second principal component, Y_2, is the linear
combination of the p variables with coefficients equal to the normalized
characteristic vector associated with the second largest characteristic root
of S, and so forth up to the pth principal component, Y_p. Each principal
component has variance equal to its corresponding characteristic root.
Together the components define the p axes of the p-dimensional concentration
ellipsoid; they are computed by the program.
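
The relationship between characteristic roots and component variances can be illustrated numerically. The following sketch assumes NumPy and uses synthetic data; the function name is hypothetical, and the data are centered before projection (an assumption that does not change the variances). It is not the program described in the text.

```python
import numpy as np


def principal_components(data):
    """Characteristic roots/vectors of the sample covariance, largest first."""
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)   # sample covariance matrix S
    roots, vectors = np.linalg.eigh(covariance)   # eigh returns ascending roots
    order = np.argsort(roots)[::-1]
    roots = roots[order]
    vectors = vectors[:, order]                   # columns are normalized B vectors
    scores = centered @ vectors                   # Y = XB for every component
    return roots, vectors, scores


# Synthetic, correlated three-variable data (m = 200 observations, p = 3).
rng = np.random.default_rng(1)
mixing = np.array([[3.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.5, 0.2]])
data = rng.normal(size=(200, 3)) @ mixing

roots, vectors, scores = principal_components(data)
print("characteristic roots:", roots)
print("component variances: ", scores.var(axis=0, ddof=1))
```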

f. Thus far in the development of the program, we have the standard
packages to accomplish the following:

. CORRE - to find means, standard deviations, and the correlation
matrix

. EIGEN - to compute eigenvalues and associated eigenvectors of
the correlation matrix

. TRACE - to select the eigenvalues that are greater than or equal
to the control value specified by the user

. LOAD - to compute a factor matrix

. VARMX - to perform varimax rotation of the factor matrix

The program has been debugged and employed on a small set of data. The
preliminary results are described in the pages which follow.

2. RESULTS AND DISCUSSION. The results of the factor analysis are
summarized in the matrix of common factor coefficients presented in Table II.
Each entry a_ij of the matrix shows the importance of the influence of factor j
upon variable i. The factor loadings indicate the net correlation between each
factor and the observed variables or features.

a. The interpretation of factor loadings may also be made in terms of
the squares of the coefficients. Each (a_ij)^2 represents the proportion of the
total unit variance of variable i which is explained by factor j, after allowing
for the contributions of the other factors. Thus in the first row of the table,
it can be seen that 90% of the variation in Feature 1 can be explained by
Factor 1. Factor 2 explains only 0.8%; Factor 3, 2.9%; etc.

b.  The  matrix  of  factor  loadings,  in  addition  to  indicating  the  weight 
of  each  factor  in  explaining  the  observed  variation,  provides  the  basis  for 
grouping  the  features  into  common  factors.  Each  feature  may  reasonably  be 
assigned  to  that  factor  in  which  it  has  the  highest  loading.  Where  loadings  of 
a  feature  in  two  factors  are  very  close,  the  feature  is  assigned  to  the  one 
judged  to  have  the  closest  affinity.  In  Table  II  clusters  of  features  with 
highest  factor  loadings  are  enclosed  in  rectangles.  Only  Factor  1  contains  a 
cluster  of  features. 
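
The squared-loading interpretation and the assignment rule can be reproduced from the legible rows of Table II. The sketch below is illustrative: the dictionary holds the four rows whose entries are readable in the source (the sign of the .170 entry is not legible, which does not affect its square), the Total Power row is omitted because one entry is illegible, and the function names are hypothetical.

```python
# Rotated factor loadings from Table II (Factors 1 through 5).
loadings = {
    "10 dB down mid-frequency": [-0.949, 0.091, 0.170, -0.189, 0.163],
    "10 dB down bandwidth": [0.941, 0.058, -0.249, 0.142, 0.172],
    "Number of peaks above 20 dB": [-0.022, -0.986, 0.034, 0.162, -0.009],
    "Fractional power ratio 2-2.5 MHz": [0.249, 0.013, -0.957, 0.147, 0.003],
}


def explained_percent(row):
    """Percent of a feature's unit variance explained by each factor."""
    return [100.0 * a * a for a in row]


def assigned_factor(row):
    """1-based index of the factor with the largest absolute loading."""
    return max(range(len(row)), key=lambda j: abs(row[j])) + 1


for feature, row in loadings.items():
    shares = ", ".join(f"{p:.1f}%" for p in explained_percent(row))
    print(f"{feature}: Factor {assigned_factor(row)} ({shares})")
```

The first row reproduces the figures quoted in the text: about 90% for Factor 1, 0.8% for Factor 2, and 2.9% for Factor 3.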

c. From an examination of the variables in each cluster it appears
reasonable to assign attributes to each factor. Thus, Factor 1 might be termed
the "Frequency Factor."

d.  The  analysis  described  is  an  exploratory  one  and,  therefore,  any 
conclusions  resulting  must  be  tentative.  However,  the  loadings  are  small  in 
Factor  5;  this  factor  may  be  eliminated  completely  by  changing  the  minimum 
eigenvalue  to  be  retained.  In  addition,  one  might  conclude  from  Factor  1  that 
only  one  of  the  frequency  features  is  heavily  loaded  on  the  same  factor.  This 
conclusion  is  supported  by  a  previous  study  conducted  by  the  authors. 

e.  The  generalized  variance  is  shown  in  Table  III  for  each  of  the 
five  factors.  More  than  82%  of  the  generalized  variance  can  be  attributed  to 
the  first  two  factors.  Thus,  it  appears  that  the  fractional  power  ratio  at 
2-2.5  MHz  and  the  total  power  from  0-3  MHz  are  not  strong  discriminators  since 
their  heaviest  loadings  are  on  factors  other  than  the  first  two. 
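
The percentages in Table III follow directly from the eigenvalues. A minimal sketch, with hypothetical names and an assumed control value of 0.1 standing in for the user-specified minimum eigenvalue:

```python
from itertools import accumulate

# Eigenvalues for Factors 1 through 5, as listed in Table III.
eigenvalues = [2.817, 1.295, 0.677, 0.157, 0.055]


def variance_shares(values):
    """Percent and cumulative percent of generalized variance per factor."""
    total = sum(values)
    percent = [100.0 * v / total for v in values]
    cumulative = list(accumulate(percent))
    return percent, cumulative


percent, cumulative = variance_shares(eigenvalues)
for k, (p, c) in enumerate(zip(percent, cumulative), start=1):
    print(f"Factor {k}: {p:5.1f}%  cumulative {c:5.1f}%")

# Assumed control value; raising it above 0.055 drops Factor 5.
control_value = 0.1
retained = [v for v in eigenvalues if v >= control_value]
print(len(retained), "factors retained")
```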

f. The correlation coefficients are shown in Table IV, where 10 dB
down bandwidth is shown to have a strong negative correlation with 10 dB down
mid-frequency, -.928. This correlation is reflected in the cluster of these
two features in Factor 1 of the rotated factor matrix in Table II.

g. While the work reported here is preliminary and results must be
treated cautiously because of small sample sizes, it appears that factor
analysis is a powerful tool for further use in guiding the selection of
features from ultrasonic waveforms. This capability should greatly enhance the
possibility of selecting features in the most efficient manner for
discrimination.




NAEC-92-140 


TABLE II. ROTATED FACTOR MATRIX

FEATURE                            FACTOR 1  FACTOR 2  FACTOR 3  FACTOR 4  FACTOR 5

10 dB Down Mid-Frequency             -.949      .091      .170*    -.189      .163
10 dB Down Bandwidth                  .941      .058     -.249      .142      .172
Number of Peaks Above 20 dB          -.022     -.986      .034      .162     -.009
Fractional Power Ratio 2-2.5 MHz      .249      .013     -.957      .147      .003
Total Power 0-3 MHz                   .408     -.498       **       .701     -.001

*  sign not legible in the source
** entry not legible in the source

TABLE III. GENERALIZED VARIANCE FOR FACTORS

                                      FACTOR 1  FACTOR 2  FACTOR 3  FACTOR 4  FACTOR 5

Eigenvalues                             2.817     1.295      .677      .157      .055
% of Generalized Variance                56%       26%       14%        3%        1%
Cumulative % of Generalized Variance     56%       82%       96%       99%      100%


TABLE IV. CORRELATION COEFFICIENTS

                          NUMBER OF     FRACTIONAL    TOTAL      10 dB DOWN  10 dB DOWN
                          PEAKS         POWER RATIO   POWER      BANDWIDTH   MID-
                          ABOVE 20 dB   2-2.5 MHz     0-3 MHz                FREQUENCY

Number of Peaks
Above 20 dB                 1.000         -.027         .586       -.065       -.096

Fractional Power Ratio
2-2.5 MHz                   -.027         1.000         .491        .495       -.425

Total Power
0-3 MHz                      .586          .491        1.000        .530       -.617

10 dB Down
Bandwidth                   -.065          .495         .530       1.000       -.928

10 dB Down
Mid-Frequency               -.096         -.425        -.617       -.928       1.000
V. GANPUI COMPONENTS

A. DATA ACQUISITION. Shown in Figure 19 is a block diagram of the basic
hardware components used by GANPUI. The system generally has a capability of
accurately digitizing signals up to the range of 10 MHz. All functions are
controllable either manually or through software. Associated with the video
terminal, but not shown, is a hard copy unit; that is, a device which reproduces
on paper the contents of the video screen. There are also three means of "soft"
storage: disk, floppy disk, and cassette.

1. The ultrasonic investigator first determines the appropriate mode of
inspection, either contact or immersion. If he chooses immersion, software
control for the x-y scanner apparatus is available to him.

2. He then sets such variables as damping, gain, repetition rate, and so
forth on the instrumentation involved. Once he is satisfied that proper signals
are being obtained, he is ready to use GANPUI.

3. The principal data acquisition program is called GETTER. This program
allows the operator to choose the sampling rate, the time window, and the number
of times a signal is to be averaged, if such a procedure is deemed necessary.
The operator may also assign a name to the data he is acquiring. Each set of
data is called a frame. Data is automatically plotted on the video terminal as
it is obtained. This allows the operator to monitor the procedure and detect
any gross jitter (time shifting) problems. The investigator then decides
whether the data is acceptable for further processing, his decision being based
on quantum levels involved, noise content, etc. Figure 20 shows a typical
format of the GETTER output. Acceptable data is then displayed as an averaged
waveform and as a Fourier transformed signal. The information pertinent to
the experiment is stored and printed on the second page. There is also space
available for comments. Both the averaged waveform and its spectrum are stored.
If the operator wishes to continue, another frame is available to him. Upon
completing the desired sequence of data, the video terminal displays the names
of the files where the data is stored. Figures 21 through 23 show typical
outputs.

4. When jitter becomes a significant problem, the operator may employ a
correlation-detection algorithm. This procedure aligns signals according to
their degree of correlation with each other. After the signals are aligned,
they are averaged.
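
One simple way to realize such an alignment is to slide each signal against a reference, keep the shift with the largest correlation sum, and average the shifted signals. The sketch below is an assumption about how this could be done, not the GANPUI algorithm: the first signal is taken as the reference, the search range `max_shift` is a hypothetical parameter, and samples shifted in from outside the record are set to zero.

```python
def correlation_at_shift(reference, signal, shift):
    """Sum of reference[n] * signal[n + shift] over the overlapping samples."""
    total = 0.0
    for n, r in enumerate(reference):
        k = n + shift
        if 0 <= k < len(signal):
            total += r * signal[k]
    return total


def best_shift(reference, signal, max_shift):
    """Shift in [-max_shift, max_shift] giving the largest correlation."""
    return max(
        range(-max_shift, max_shift + 1),
        key=lambda s: correlation_at_shift(reference, signal, s),
    )


def shift_signal(signal, shift):
    """Advance the signal by `shift` samples, padding with zeros."""
    n = len(signal)
    return [signal[k + shift] if 0 <= k + shift < n else 0.0 for k in range(n)]


def align_and_average(signals, max_shift):
    """Align every signal to the first one, then average sample by sample."""
    reference = signals[0]
    aligned = [
        shift_signal(s, best_shift(reference, s, max_shift)) for s in signals
    ]
    return [sum(column) / len(aligned) for column in zip(*aligned)]


# Hypothetical echo and a copy arriving two samples later (jitter).
pulse = [0.0, 0.0, 1.0, -2.0, 1.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, -2.0, 1.0, 0.0]
averaged = align_and_average([pulse, delayed], max_shift=3)
print(best_shift(pulse, delayed, 3), averaged)
```

Averaging without the alignment step would smear the pulse, which is why the correlation step precedes the average.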

5. The GANPUI system also includes procedures for determining the
sensitivity requirements of a particular inspection. Testing sensitivity is
determined by the minimum size of a flaw which must be detected according to
pertinent specifications or other engineering requirements. If the sensitivity
is too low, it is possible to miss flaws which are dangerous for the structural
strength. Too much sensitivity causes detection of great amounts of
structural inhomogeneities and insignificant flaws.

6. The section on Signal Processing includes descriptions of averaging
and correlation detection.

7. The program that retrieves previously acquired data for further
processing or evaluation is called REPLAY. The input to this program is the
name of the file containing the desired data. The operator may select only
those frames which he feels are useful and disregard those that are not of
interest. See Figures 26 through 33.


Figure 19. Block Diagram of a Fast Ultrasonic
Data Acquisition and Analysis System

Figure 21. Typical Output Getter Frame

Figure 24. Typical Output Getter Frame

Figure 26. Blank Frame

Figure 28. Blank Frame

Figure 30. Replay Frame

B. FEATURE EXTRACTION. Feature extraction is the process whereby original and
transformed signals are parameterized and the parameters treated as components
of a column-like vector called a feature vector. Figure 34 below shows how a
waveform might be parameterized.


[Figure 34. Waveform panels: NORMAL WAVE, SHEAR WAVE]

PEAK-TO-PEAK

a. Absolute maximum + absolute minimum in either preceding or following
   half cycle in 0-4 μsec window (normal component),
b. Absolute maximum + absolute minimum in either preceding or following
   half cycle in 8-12 μsec window (shear component),
c. Any maxima rising above the 6 dB down (50%) level + absolute minimum
   in either preceding or following half cycle.

A_1 - The first peak-to-peak value in time over the 0-4 μsec portion of the
      waveform. A_1 will never be zero.

A_2 - The second peak-to-peak value in time over the 0-4 μsec portion of the
      waveform. If there is no second peak-to-peak as defined above, A_2 = 0.

A_3 - The first peak-to-peak value in time over the 8-12 μsec portion of the
      waveform. If no peaks appear in this portion, A_3 = 0.

A_4 - The second peak-to-peak value in time over the 8-12 μsec portion of the
      waveform. If no peaks appear in this portion, A_4 = 0.

T_1 - The time from the maximum value of A_1 to the maximum value of A_?
      (subscript illegible).

T_2 - The time from the maximum value of A_1 to the maximum value of A_4.

feature vector = (A_1, A_2, A_3, A_4)^t

1. All feature extraction is implemented in software. Prior to performing
feature extraction, the investigator may wish to examine other representations
of the signal. The most popular is its Fourier transform including both the
power spectrum and phase angle. Recently, the so-called cepstrum has been
found to be useful in superposition problems. These transforms are available
in GANPUI along with the option of arbitrary transforms (operator defined).

2. Some work requires the use of the transfer function of the system under
investigation. This is obtained by a process known as deconvolution. This
process may also be used to compensate for changes in transducer
characteristics. GANPUI has a deconvolving algorithm and the software to
extract features from the transfer function.

3. In those studies where signal amplitudes are hindering rather than
helping, appropriate normalizing procedures may be used. Signals may be
normalized to have peak-to-peak values of one, to have zero mean with standard
deviation of one, and so forth. Transformed data may have to be normalized
depending on the nature of the feature extraction involved.
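
The two normalizations named above can be sketched as follows. The function names are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption; the report does not say which is intended.

```python
import statistics


def normalize_peak_to_peak(signal):
    """Scale a signal so that max - min equals one."""
    span = max(signal) - min(signal)
    if span == 0:
        raise ValueError("constant signal has no peak-to-peak value")
    return [v / span for v in signal]


def normalize_zero_mean_unit_std(signal):
    """Shift and scale a signal to zero mean and unit standard deviation."""
    mean = statistics.mean(signal)
    std = statistics.pstdev(signal)  # population standard deviation (assumed)
    if std == 0:
        raise ValueError("constant signal has zero standard deviation")
    return [(v - mean) / std for v in signal]


# Hypothetical digitized waveform.
wave = [0.0, 3.0, -1.0, 2.0]
unit_span = normalize_peak_to_peak(wave)
standardized = normalize_zero_mean_unit_std(wave)
print(unit_span, standardized)
```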

4. This new preprocessed data (transformed, deconvolved, normalized, etc.)
is the set of functions on which feature extraction is performed. In general,
features are of two broad classes: statistical and physically motivated.
Statistical features are those such as mean value, standard deviation, kurtosis,
etc. Physically motivated features might include parameters such as arrival
times, ratios of echo amplitudes to a reference amplitude, spectral depression
spacing, etc. There are also two methods of applying these concepts. One is
called Global and the other is called Local. Global methods involve the analysis
of the entire preprocessed function, whereas Local methods look only at windowed
portions of the data. An example of Global and Local feature extraction is
shown in Figures 30-33. The feature is the peak frequency of the Fourier
spectrum.
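
The difference between Global and Local extraction of the peak-frequency feature can be illustrated with a synthetic signal. This sketch assumes NumPy; the 20 MHz sampling rate, the test signal, the window boundaries, and the function names are all hypothetical choices made for the illustration.

```python
import numpy as np


def peak_frequency(signal, sample_rate_hz):
    """Frequency (Hz) of the largest power-spectrum bin, ignoring DC."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    return freqs[np.argmax(power[1:]) + 1]


def local_peak_frequency(signal, sample_rate_hz, start, stop):
    """Same feature computed on the windowed portion signal[start:stop]."""
    return peak_frequency(signal[start:stop], sample_rate_hz)


# Synthetic record: a strong 2 MHz burst followed by a weaker 5 MHz burst.
rate = 20e6
n = np.arange(400)
t = n / rate
signal = np.where(
    n < 200,
    np.sin(2 * np.pi * 2e6 * t),
    0.3 * np.sin(2 * np.pi * 5e6 * t),
)

global_peak = peak_frequency(signal, rate)
late_peak = local_peak_frequency(signal, rate, 200, 400)
print(global_peak, late_peak)
```

The Global feature is dominated by the stronger burst, while the Local feature over the later window picks out the weaker one, which is the kind of distinction windowing is meant to expose.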

C. VECTOR FILING SYSTEM. When ultrasonic inspection is used to predict
component performance or quality, the problem is considered either in the form
of discrete classes or as a continuum. For example, an adhesive bond strength
investigation may classify bonds as good or bad (2 class), as high, medium, or
low breaking strength (3 class), or continuously by actually predicting the
breaking strength in lbs./in.². Each type of problem requires a different form
of algorithm. Since different algorithms require different inputs, feature
values (vectors) must be filed in a manner appropriate to the algorithm
employed.

1. The general approach is to have three distinct file structures. The
first is a file containing only single features over the entire domain of the
problem. That is, a sub-file with only feature 1 values, a sub-file with only
feature 2 values, etc. Algorithms requiring this type of structure include PDF
estimation, 2-space plotting, fuzzy logic, etc. These algorithms will be
explained separately.

2. Secondly, a file containing complete vectors is necessary. Algorithms
of the ALN (adaptive learning network) type require vector inputs.

3. Lastly, a file structure based on class restricted vectors is required.
The Fisher Linear Discriminant, Minimum Distance, and Factor Analysis algorithms
use class restricted vector inputs.
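
The three file structures can be pictured as three views of the same set of labelled feature vectors. The sketch below uses in-memory Python containers purely for illustration; the report does not describe the on-disk format, and the function name, feature names, class labels, and values are hypothetical.

```python
from collections import defaultdict


def build_files(records, feature_names):
    """Build the three views from (class_label, feature_vector) records."""
    # 1. One sub-file per feature, over the entire domain of the problem.
    by_feature = {
        name: [vector[k] for _, vector in records]
        for k, name in enumerate(feature_names)
    }
    # 2. Complete vectors, as needed by ALN-type algorithms.
    complete_vectors = [list(vector) for _, vector in records]
    # 3. Class restricted vectors, as needed by the Fisher Linear Discriminant.
    by_class = defaultdict(list)
    for label, vector in records:
        by_class[label].append(list(vector))
    return by_feature, complete_vectors, dict(by_class)


# Hypothetical two-class bond study with two features per waveform.
records = [
    ("good", [2.1, 0.8]),
    ("bad", [1.4, 1.9]),
    ("good", [2.3, 0.7]),
]
by_feature, vectors, by_class = build_files(
    records, ["center_frequency", "bandwidth"]
)
print(by_feature["bandwidth"], by_class["good"])
```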




D. PATTERN RECOGNITION PROCEDURES. The first procedure involves the
estimation of probability density functions. These curves graphically indicate
the probability that a feature will assume a particular value. A narrow unimodal
distribution could indicate a poor feature since, over all classes, only one
value is most likely. The feature would have no merit in differentiating
between classes. On the other hand, a multi-modal distribution would indicate
a potentially good feature, one that (it is hoped) varies from class to class.
See Figure 35.



Figure  35.  Multi-Modal  Distribution 
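As a rough illustration of this first procedure, the sketch below estimates a PDF with a simple histogram and counts its local maxima. The histogram estimator, the bin count, and the sample values are assumptions made for illustration; the report does not state which estimator was used.

```python
def histogram_pdf(values, n_bins, lo, hi):
    """Histogram estimate of a PDF on [lo, hi]; assumes lo <= v <= hi for all v."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        k = min(int((v - lo) / width), n_bins - 1)  # v == hi goes in the last bin
        counts[k] += 1
    total = len(values)
    # Divide by total * width so the estimate integrates to one.
    return [c / (total * width) for c in counts]

def count_modes(pdf):
    """Crude local-maximum count; flat shoulders may be miscounted."""
    modes = 0
    for k, p in enumerate(pdf):
        left = pdf[k - 1] if k > 0 else 0.0
        right = pdf[k + 1] if k < len(pdf) - 1 else 0.0
        if p > 0 and p > left and p >= right:
            modes += 1
    return modes

# Made-up feature values pooled over two classes.
pooled = [1.0, 1.1, 1.2, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0, 3.2]
pdf = histogram_pdf(pooled, n_bins=8, lo=0.0, hi=4.0)
n_modes = count_modes(pdf)
```

With these values the estimate has two separated peaks, which is the multi-modal shape the text associates with a potentially useful feature.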

1. Completion of this stage might possibly give insight into the development
of a fuzzy type algorithm. This type of reasoning is illustrated in section IVA.

2.  A  second  step  is  the  use  of  2-space  plots.  These  are  plots  of  feature 
I  versus  feature  J  for  each  pair  I,  J.  Plots  such  as  these  indicate  feature 
interactions  and  also  have  the  potential  to  define  distinct  clusters,  each 
cluster  defining  a  particular  class  in  the  problem  framework.  Simple  algorithms 
may  also  be  defined  on  the  curves.  Examples  from  a  Crack  versus  Geometry  study 
are  shown  in  Figures  36-38. 
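The bookkeeping behind 2-space plotting can be sketched as follows: for every pair of features I, J, the points of each class are collected so they can be drawn with a separate symbol. The function name and the data layout are hypothetical, and the plotting itself is left out.

```python
from itertools import combinations

def two_space_points(class_vectors):
    """class_vectors: dict mapping class label -> list of feature vectors.

    Returns a dict mapping each feature pair (i, j), i < j, to a dict of
    class label -> list of (x_i, x_j) points.
    """
    n = len(next(iter(class_vectors.values()))[0])
    plots = {}
    for i, j in combinations(range(n), 2):
        plots[(i, j)] = {
            label: [(v[i], v[j]) for v in vectors]
            for label, vectors in class_vectors.items()
        }
    return plots

# Made-up three-feature vectors for two classes.
data = {"crack": [[1, 2, 3]], "geometry": [[4, 5, 6]]}
plots = two_space_points(data)
```

A vector of n features yields n(n-1)/2 plots, so the number of plots to inspect grows quickly with the size of the feature vector.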

3. If simple algorithms are not feasible at this point, more sophisticated
algorithms must be used. The results of these initial analyses may be used to
give direction to the use of the higher level algorithms.

Figure 36. Two Space Plot Between Number of Spectral Depressions
Above 20 dB (X1) and 10 dB Down Bandwidth (X4)

Figure 38. Two Space Plot Between Number of Spectral Depressions
Above 20 dB (X1) and Fractional Power Ratio (X2)

4. Before considering complex algorithms, PDF curves may be used again.
This time the curves are plotted on a feature by class basis. An example of
a typical plot is shown in Figure 39. The shaded regions indicate where values
on the x-axis would indicate a high-strength bond. Other regions are most likely
to contain values obtained from low-strength bonds. An algorithm may possibly be
implemented at this stage.


Figure  39.  Use  of  PDF  Curves 
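One way such a feature-by-class comparison could be turned into a rule is sketched below. The histogram densities, the bin layout, the tie-breaking toward "low", and the sample values are all assumptions for illustration; the report does not specify how the shaded regions were derived.

```python
def density(values, lo, hi, n_bins):
    """Histogram density on [lo, hi]; assumes every value lies in the range."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return [c / (len(values) * width) for c in counts]

def high_regions(high_vals, low_vals, lo, hi, n_bins):
    """True for each bin where the high-strength density exceeds the low-strength one."""
    p_high = density(high_vals, lo, hi, n_bins)
    p_low = density(low_vals, lo, hi, n_bins)
    return [ph > pl for ph, pl in zip(p_high, p_low)]

def classify(x, regions, lo, hi):
    """Assign x to 'high' if it falls in a shaded bin; ties and empty bins go to 'low'."""
    width = (hi - lo) / len(regions)
    k = min(int((x - lo) / width), len(regions) - 1)
    return "high" if regions[k] else "low"

# Made-up values of one feature for each class.
high_vals = [0.22, 0.30, 0.25, 0.82, 0.85]
low_vals = [0.50, 0.55, 0.58, 0.45, 0.50]
regions = high_regions(high_vals, low_vals, lo=0.0, hi=1.0, n_bins=5)
```

Here the shaded (True) bins are disjoint, which mirrors the idea that several separate intervals of the x-axis can indicate the same class.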

5. Factor Analysis is used next. This is a statistical procedure for
determining those features which contribute most to variation in classes. It
is a method for ordering features with respect to their degree of importance
in the class discrimination problem. See the section on Factor Analysis for
a detailed description of this method.

6. The above procedures essentially define those features that will be
useful. Using this set of "good" features, three other techniques are possible.

7. A minimum distance classifier is an algorithm requiring two sets of
statistically similar data. One set is used for training and the other for
evaluation of the algorithm. The training set is called the prototype set; the
evaluation set is called the test set. Training information is used in
establishing the prototype vectors along with the minimum distance classification
routine. The procedure works quite simply by examining a distance in an
"n"-dimensional space, "n" being the number of elements in the feature vector. A
test vector is compared to the two prototype vectors by a distance formula.

The test vector is assigned to the class of the prototype to which it is
closest. The formulas used are summarized below.


$$d_{T1} = \sqrt{(x_{T1} - x_{P11})^2 + (x_{T2} - x_{P12})^2 + \cdots + (x_{Tn} - x_{P1n})^2}$$

$$d_{T2} = \sqrt{(x_{T1} - x_{P21})^2 + (x_{T2} - x_{P22})^2 + \cdots + (x_{Tn} - x_{P2n})^2}$$

which reduces to

$$d_{TN} = \sqrt{\sum_{i=1}^{n} (x_{Ti} - x_{PNi})^2}$$

where

$N$ = prototype number

$n$ = number of elements in the feature vector

$d_{TN}$ = distance between the test vector and the prototype vector $N$

if $d_{T1} < d_{T2}$ then class 1

if $d_{T2} < d_{T1}$ then class 2
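The distance rule above can be sketched in a few lines. Taking each prototype vector as the mean of its class's training vectors is an assumption made here for illustration (the text says only that training information establishes the prototypes), and the function names and data are hypothetical.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(n)]

def distance(x, p):
    """Euclidean distance d_TN between test vector x and prototype p."""
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, p)))

def classify(test_vector, prototypes):
    """prototypes: dict mapping class label -> prototype vector."""
    return min(prototypes, key=lambda c: distance(test_vector, prototypes[c]))

# Made-up training (prototype) set for two classes.
training = {1: [[1.0, 1.0], [2.0, 2.0]], 2: [[6.0, 6.0], [8.0, 8.0]]}
prototypes = {label: mean_vector(vecs) for label, vecs in training.items()}
```

Because `classify` takes any number of prototypes, the same sketch covers problems with more than two classes; the test set would be run through it to estimate reliability.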
VI. SAMPLE PROBLEMS USING GANPUI

A. GENERAL CONSIDERATIONS. This section will follow the paths through GANPUI
that have led to the development of two successful algorithms. The problems
to be discussed are intergranular stress corrosion cracking and adhesive bonds.

1. The first major step after problem definition is transducer selection.
This involves specification of frequency ranges; the mode of usage (immersion,
contact, boot, etc.); the choice of narrow band or broad band; and whether a
single or dual element probe should be used. The inherent noise levels of the
problem are a major factor in the choice of these parameters. Low noise levels
facilitate the use of transducer compensation routines. The transfer function
of the system may also be used as a feature source when low noise levels are
involved. Higher feature levels indicate that tight specifications are necessary
for transducers that are different from the design transducer.

2. The question arises, will features be transducer dependent or
independent? The answers to this type of question establish the data acquisition
protocol.


3. The next step depends on the physics and mechanics of the problem. This
is the choice of feature sources and those features that are to be extracted.
Questions such as "Is superposition possible?", "Are there frequency shifts
involved?", and "What are possible attenuation effects?" all point to the
particular features that are required.

4. Finally, using the acquired feature vectors, pattern recognition methods
are tried. The progression is from simple to sophisticated, simpler solutions
being favored. Once several algorithms have been attempted, the trade-offs
between simplicity, reliability, and economy are evaluated. Then, in a sense,
the optimum scheme is implemented. See Appendices and references 1, 2, and 3
below for further details.

B. SAMPLE PROBLEMS.

                      Bonds                        Cracks

Mode                  Immersion                    Contact

Frequency Range       10 MHz                       1.5-3 MHz

Type                  Single element,              Dual element,
                      broad band                   narrow band

Signal Processing     Spatial averaging,           Signal averaging
                      signal averaging

Noise Level           Low                          High

Deconvolution                                      --

Feature Sources       Transfer function            Video envelope,
                                                   Fourier spectrum

Features                                           Pulse duration,
                                                   partial energy in spectrum

Algorithm             Fisher linear discriminant   Two space plot

Performance           --
(# correct/total)
APPENDIX A

TABLE A-1. INPUT DATA TO FACTOR ANALYSIS PROGRAM

CASE   FEATURE 1   FEATURE 2   FEATURE 3   FEATURE 4   FEATURE 5

  1        1         .052        7.071        .781       1.465
  2        1         .086        7.387        .879       1.514
  3        1         .053        7.552        .879       1.514
  4        2         .088        7.319        .781       1.562
  5        1         .029        6.780        .684       1.514
  6        7         .082       12.401        .781       1.514
  7        6         .071       10.103        .537       1.440
  8                  .079       12.786       1.660        .879
  9                  .045       11.900        .684       1.514
 10       10         .049        9.460        .684       1.367


TABLE A-2. MEANS AND STANDARD DEVIATIONS

FEATURE   MEAN          STANDARD DEVIATION

   1      (illegible)        4.932
   2       .062               .019
   3      (illegible)        2.509
   4       .923               .402
   5      1.360               .258
APPENDIX B - THE UTILITY OF GRAPHICAL AIDS

A. PATTERN RECOGNITION. Pattern recognition studies often involve the use of
spaces with high dimensionality. A feature vector of dimension six is not
uncommon. One way to graphically display such a vector is through the use of
closed polygons of equal sides. This, of course, applies to vectors of
dimension three or greater, an equilateral triangle being the closed polygon with
the least number of sides. Consider the polygon shown in Example B-1. Each
side is length L. Each side may be considered the range of a vector component
if the component were normalized to span the interval 0, L. This is easily
accomplished by applying the following formulation.

Let a = the minimum value a feature can assume

Let b = the maximum value a feature can assume

Let L = the desired range to be spanned by the feature

Then

$$v(f) = \frac{f - a}{b - a}\, L$$

will map feature values f onto the range 0, L.

Consider the set of features f1, f2, and f3 with corresponding ranges

-10  f !  S 

?5  f  2 

-30  ij  ;s 
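A minimal sketch of the normalization, together with one possible way of placing each normalized component along a side of a regular polygon, is given below. The placement rule (component k measured from vertex k toward vertex k+1), the function names, and the example ranges are assumptions for illustration; the original examples are not reproduced here.

```python
import math

def normalize(f, a, b, L=1.0):
    """Map a feature value f in [a, b] onto [0, L]."""
    return (f - a) / (b - a) * L

def pattern_points(vector, ranges, L=1.0):
    """Place component k at distance v(f_k) along side k of a regular polygon.

    The polygon has one side of length L per vector component. Joining the
    returned points in order gives the pattern for the vector (assumed rule).
    """
    n = len(vector)
    radius = L / (2.0 * math.sin(math.pi / n))  # circumradius for side length L
    verts = [(radius * math.cos(2 * math.pi * k / n),
              radius * math.sin(2 * math.pi * k / n)) for k in range(n)]
    points = []
    for k, (f, (a, b)) in enumerate(zip(vector, ranges)):
        t = normalize(f, a, b, L) / L  # fraction of the way along side k
        (x0, y0), (x1, y1) = verts[k], verts[(k + 1) % n]
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points

# Hypothetical ranges for a three-component vector (equilateral triangle).
ranges = [(-10.0, 10.0), (0.0, 50.0), (-30.0, 30.0)]
points = pattern_points([0.0, 25.0, 0.0], ranges)
```

With every component at the middle of its range, each point falls at the midpoint of its side; vectors that are close in feature space give points that are close on every side, and hence similar patterns.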
1. It is noted that a different pattern is generated for a different vector.
Also, for vectors that are relatively "close" or clustered, the generated
patterns are similar; vectors that are declustered (not in one particular cluster
but in another) generate different patterns.

2. The efficient use of this display method comes when dealing with higher
dimension vectors. Shown in Example B-4 is an example for a six-dimensional
vector.

3. Parametric plotting may also be useful as an aid to pattern recognition
studies. This technique is best illustrated by example. Three
Fourier spectrums are shown in Example B-5. The one on the right may be
considered as a "reference" spectrum. The remaining two spectrums may be assumed
to be derived from two different classes of time functions. At each frequency
there corresponds a value on the reference spectrum, r(f), and values on the
other two spectrums, s1(f) and s2(f) respectively. Using the x-axis as an r(f)
axis and the y-axis for the r(f), s1(f), and s2(f) axes, the parametric plots
shown in Example B-6 are obtained.
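The construction of a parametric plot can be sketched as pairing, frequency by frequency, the reference value with the value of the spectrum under study. The function name and the sampled spectra below are made up for illustration, and both spectra are assumed to be sampled at the same frequencies.

```python
def parametric_points(reference, spectrum):
    """Pair r(f_k) with s(f_k) at each common frequency sample f_k."""
    if len(reference) != len(spectrum):
        raise ValueError("spectra must be sampled at the same frequencies")
    return list(zip(reference, spectrum))

# Made-up spectrum magnitudes at four common frequencies.
r = [0.1, 0.5, 1.0, 0.4]
s1 = [0.1, 0.4, 0.9, 0.5]
reference_curve = parametric_points(r, r)   # falls on the line y = x
class_curve = parametric_points(r, s1)
```

Plotting the reference against itself gives the 45 degree line, so the departure of a class curve from that line shows at a glance how the spectrum differs from the reference.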

Another technique is one involving the use of so-called "pie" graphs.
This method is generally applied to curves whose areas may be normalized to
one. Fourier spectrums are good examples. See Example B-7a.

3. This unit area can then be related to a circle or "pie" of unit area,
Example B-7b. The original curve can be sectioned into intervals or bands,
each of equal length. See Example B-8a. The intervals will contain certain
percentages of the total area. These percentages correspond to "slices" of
varying size within the "pie". Example B-8b illustrates this concept. Instead
of denoting circle sectors by numbers, a gray scale code or symbolic patterns
may be used. Example B-9 shows a typical display mode.
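The slice sizes can be computed as sketched below. The area in each band is approximated by summing equally spaced samples, which is an assumption about how the spectrum is stored; the function names and sample values are hypothetical.

```python
def band_fractions(spectrum, n_bands):
    """Fraction of the total area falling in each of n_bands equal-length bands.

    spectrum: equally spaced, non-negative samples of the curve.
    """
    if len(spectrum) % n_bands != 0:
        raise ValueError("number of samples must be a multiple of n_bands")
    size = len(spectrum) // n_bands
    total = sum(spectrum)
    return [sum(spectrum[k * size:(k + 1) * size]) / total
            for k in range(n_bands)]

def slice_angles(fractions):
    """Central angle, in degrees, of the pie slice for each band."""
    return [360.0 * f for f in fractions]

# Made-up spectrum samples split into three bands.
fractions = band_fractions([1, 3, 4, 2, 6, 4], n_bands=3)
angles = slice_angles(fractions)
```

Dividing by the total plays the role of normalizing the curve to unit area, so the fractions always sum to one and the slices fill the circle.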
APPENDIX C - (ABSTRACT) A PATTERN RECOGNITION REFLECTOR CLASSIFICATION
FEASIBILITY STUDY - CRACK (IGSCC) VS. GEOMETRIC (CROWN)
REFLECTOR IN 304 STAINLESS STEEL PIPE WELD SPECIMENS

A feasibility study has been conducted in order to evaluate the potential
of pattern recognition techniques for discriminating between geometrical and
crack reflector signals obtained during ultrasonic inspection of the weld zone
in 304 austenitic stainless steel pipes. A geometrical reflector is defined
as a reflector associated with the weld geometry and/or a flaw incapable of
causing catastrophic failure, e.g., crown, counterbore, suck-back, drop-thru, etc.
Seven welds from four different 4" diameter pipe specimens containing
intergranular stress corrosion cracking (IGSCC) were examined ultrasonically. The
ultrasonic inspection was conducted in a pulse echo mode using a 1.5 MHz nominal
center frequency, 3/8" diameter transducer mounted on a plexiglass shoe with a
45° refracted transverse wave insonifying the area of interest. The ultrasonic
data was correlated with the dye penetrant tests and ultrasonic examination
conducted by Southwest Research Institute (SwRI) in order to obtain valid training
information. The data in this particular feasibility study included crown
geometric reflectors and crack reflectors. A total of 107 crown indications and
40 intergranular stress corrosion cracking indications were analyzed. The
analysis did not consider any arrival time, amplitude information or, in fact,
any other time domain features, but was based on various Fourier transform
features. A 100% reliability level was obtained for discriminating an IGSCC
indication vs. a crown indication using an automated pattern recognition algorithm.

The  overwhelming  success  of  the  pattern  recognition  algorithm  employed  in 
this  study  demonstrates  the  applicability  of  this  technique  for  solving  such 
important problems as discrimination between IGSCC and geometric reflectors in
304  stainless  steel  pipe  welds.  Work  on  other  kinds  of  geometric  reflectors 
is  in  progress  for  establishing  an  overall  reliability  level  in  reflector 
classification  analysis. 
APPENDIX D - (ABSTRACT) "TRANSDUCER COMPENSATION
CONCEPTS IN FLAW CLASSIFICATION"

by 

Joseph  L.  Rose,  Professor  of  Mechanical  Engineering 

and 

Michael  J.  Avioli,  Graduate  Student,  Mechanical  Engineering  and  Mechanics 


Flaw classification analysis is quite often strongly influenced by the
type of ultrasonic waveform that is generated by an ultrasonic transducer. One
goal of this paper is to introduce procedures that could possibly make flaw
classification algorithms somewhat independent of certain ultrasonic
waveform characteristics being used in the data acquisition procedure. Data
acquisition of ultrasonic pulse echo signals depends quite strongly on many
test system characteristics, in particular, special characteristics of the
ultrasonic transducer and pulser-receiver instrument characteristics. A
transducer compensation procedure is presented in this work that requires a suitable
reference signal containing "noise" contributed only by system components
external to the unknown flaw. In software, a processing scheme is designed
to remove these external effects, thereby allowing concentration on the flaw
characteristics contained within the ultrasonic signal.

The processing scheme has four general components: Acceptor, Compensator,
Comparator, and Evaluator. The acceptor is basically a gate that decides
whether or not a particular transducer is usable for the problem at hand. The
compensator implements a mathematical deconvolution process. The comparator
does a feature by feature similarity check on the desired signal and the
compensated signal. The evaluator is any scheme that can determine the performance
of a given transducer. In particular, the algorithm under study may be used to
evaluate transducer performance. The evaluation stage is followed by an
examination of the comparator results. Tolerances relating to the acceptability of
a transducer are obtained through this final stage.

Model analysis is used to study the compensation problem. A Layered Model
is used with various levels of system "noise" being introduced, in order to
examine the "noise" effects in the deconvolution computation process. Promise
for attaining success in this difficult compensation problem is good,
particularly when considering signal averaging as a signal processing tool.
APPENDIX E - (ABSTRACT) THE FISHER LINEAR DISCRIMINANT FUNCTION
FOR ADHESIVE BOND STRENGTH PREDICTION

An ultrasonic inspection system for the prediction of adhesive bond strength
for metal-to-metal applications is of great value to many government and
industrial agencies throughout the world. The prediction of adhesive bond strength
based on surface preparation, assuming that there are no delaminations,
inclusions, or such cohesive type problems as improper curing, etc., is the goal of
this study. Ultrasonically evaluating adhesive bonds that have partially
delaminated is generally easily accomplished by using C-scan techniques, but a
major problem arises when the deficiency in the bond is either adhesive or
cohesive in nature. Our study involved primarily the adhesive aspect of the bond
strength, which is related to the surface preparation problem. Test specimens
were manufactured so that an improper surface preparation occurred on either or
both substrates in an aluminum-to-aluminum step-lap joint. The specimens with
little or no surface preparation provided weak bonds and the specimens with
proper surface preparations, in general, produced strong bonds.

A resource base developed in earlier years in experimental technology,
theoretical ultrasonic wave interaction studies with adhesive bond models,
manufacturing technology, and shear stress distribution analysis has been
incorporated into a pattern recognition program of study. Such topics as
nearest neighbor philosophy, fuzzy logic analysis, probability density function
analysis, and adaptive search and learning techniques for linear and non-linear
models have been investigated. A Fisher Linear Discriminant algorithm has been
developed which affords a 91% reliable prediction for adhesive bond strength.
Unfortunately, results indicate that the prediction algorithms depend strongly
on the particular transducer which was initially used for data acquisition.
Data acquired with a second transducer, having different pulse form
characteristics, in general, did not provide reliable results for predicting adhesive
bond strength. To compensate for the transducer differences, a deconvolution
technique was implemented to expand the selection of useful transducers.
Limited success with this technique has been obtained to date because of the
inherent system noise in ultrasonic data acquisition equipment.

A  completely  automated  ultrasonic  inspection  system  has  been  developed  for 
predicting  bond  strength  in  metal-to-metal  adhesively  bonded  step-lap  joints. 
Results  to  date  provide  a  91%  reliability  for  solving  this  difficult  problem 
of  predicting  adhesive  bond  performance. 
APPENDIX F - ON THE UTILITY OF PROBABILITY
DENSITY FUNCTION ANALYSIS

Probability density function curves can be useful for solving a large
number of engineering problems. Of primary concern in pattern
recognition, probability density functions can be used for feature selection.
Pattern recognition problems call for the establishment of a feature vector
that can be used for developing a reflector classification algorithm. In
addition to this very important application of PDF curves for feature
evaluation, a variety of other applications in engineering is being considered today.
The principles of probability density function analysis are particularly suited
to an inspection philosophy for composite materials. A brief review of
possible applications is outlined below.

1. To Evaluate Material Uniformity in a Quality Control Test - Composite
materials, because of their dual material content, variation in fabrication,
anisotropic character, etc., are noisy with respect to ultrasonic waveform
content as reflected from the composite structure. A good composite material
will have a PDF curve for a particular feature that is fairly tight.
Experimental analysis, of course, can acquire this PDF information or "PDF signature".
Uniformity of the composite material can, therefore, be evaluated since poorly
manufactured composites would have a different PDF signature, most probably
a wider and distorted PDF curve. A material acceptance
criterion could, therefore, be written as a function of tolerances on the PDF curves.

2. Material Selection Philosophy for Improved Inspectability - Quite
often a number of fabrication techniques are considered in the development of
a composite material. When strength, temperature, and moisture tests indicate
that the performance of the composite is independent of the fabrication process,
it is proposed to select the fabrication process with the greatest inspectability.
Inspectability of the material can be established on the basis of PDF signatures
for the composite materials and their corresponding fabrication process. Again,
a tightly grouped PDF curve for certain features is highly desirable.

3. Material Lay Up Selection - Composite materials can be laid up at a
variety of fiber orientations and lay ups. In some cases, the angle ply lay up
procedures are important for improved composite material performance. On the
other hand, performance may not be improved. In this case, it is suggested
that PDF signatures for the various composite material angle ply configurations
be used to select the lay up that is most inspectable, again following some of
the logic developed earlier for tightly grouped PDF curves.

4. Transducer Selection - Probability density function curves can even
be used to select transducers for material inspection. As an example, a
comparison of single element versus dual element transducer application for a
composite material can be evaluated. Dual element transducer work will improve
composite material inspection because of the removal of back scatter ultrasonic
radiation. This physical principle of recording only forward scatter
information can be demonstrated quite nicely by examining a number of features in a
probability density function analysis. Nicely distributed and tight PDF curves
could be used to select the best transducer for a particular application.
5. Damage Evaluation - Once a PDF signature has been acquired for a
composite material, a PDF curve produced at some later date that shows a marked
change is indicative of material change or degradation. In most cases, the
change would come about because of crack, delamination, or environmental
degradation. PDF curves produced at various areas of a composite material can,
therefore, be used to provide damage propagation information.
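One way a tolerance on a PDF signature might be checked is sketched below. Summarizing the signature by the mean and standard deviation of a feature, and the particular tolerance values, are assumptions made for illustration; the text does not prescribe how the tolerances are to be written.

```python
import statistics

def pdf_signature(values):
    """Summarize a feature's distribution by its mean and sample standard deviation."""
    return {"mean": statistics.mean(values), "stdev": statistics.stdev(values)}

def within_tolerance(sample, reference, mean_tol, spread_ratio_tol):
    """Accept if the mean has not shifted and the curve has not widened too much."""
    mean_ok = abs(sample["mean"] - reference["mean"]) <= mean_tol
    spread_ok = sample["stdev"] <= spread_ratio_tol * reference["stdev"]
    return mean_ok and spread_ok

# Made-up feature values: a reference material, a good part, and a degraded part.
reference = pdf_signature([1.00, 1.02, 0.98, 1.01, 0.99])
good = pdf_signature([1.01, 0.99, 1.00, 1.02, 0.98])
poor = pdf_signature([0.7, 1.3, 1.0, 1.5, 0.6])
good_ok = within_tolerance(good, reference, mean_tol=0.05, spread_ratio_tol=1.5)
poor_ok = within_tolerance(poor, reference, mean_tol=0.05, spread_ratio_tol=1.5)
```

The same comparison, repeated at different times or at different areas of a part, gives a simple numerical handle on the widening or shifting of the PDF curve described in the applications above.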